Rachel Zhang
1. A Brief Introduction to RPCA
1. Why use Robust PCA
To handle high-magnitude spike noise, rather than small Gaussian-distributed noise.
2. Main Problem
Given C = A* + B*, where A* is a sparse spike-noise matrix and B* is a low-rank matrix, the goal is to recover B*.
B* = UΣV', in which U ∈ R^(n×k), Σ ∈ R^(k×k), V ∈ R^(n×k).
3. Difference from PCA
Both PCA and Robust PCA aim at matrix decomposition; however:
In PCA, M = L0 + N0, where L0 is a low-rank matrix and N0 is a small i.i.d. Gaussian noise matrix. PCA seeks the best rank-k estimate of L0 by minimizing ||M − L||_2 subject to rank(L) ≤ k. This problem can be solved by SVD; a small numerical sketch follows after this comparison.
In RPCA, M = L0 + S0, where L0 is a low-rank matrix and S0 is a sparse spike-noise matrix. We will work toward the solution in the following sections.
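As a concrete illustration of the PCA case, here is a minimal numpy sketch (the matrix sizes and noise level are illustrative assumptions, not from the text): by the Eckart-Young theorem, the best rank-k estimate of L0 is obtained by truncating the SVD of M to its top k singular values.

```python
import numpy as np

def best_rank_k(M, k):
    """Best rank-k approximation of M (Eckart-Young): truncate the SVD."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

# Example: a rank-2 matrix L0 corrupted by small i.i.d. Gaussian noise N0.
rng = np.random.default_rng(0)
L0 = rng.standard_normal((50, 2)) @ rng.standard_normal((2, 40))
M = L0 + 0.01 * rng.standard_normal((50, 40))

L = best_rank_k(M, k=2)
print(np.linalg.norm(M - L))  # small: only the Gaussian noise floor remains
```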
2. Conditions for correct decomposition
4. Ill-posed problem:
Suppose a sparse matrix A* and B* = e_i e_j^T together form a solution of this decomposition problem. (Here e_i denotes the i-th standard basis vector, so e_i e_j^T is rank-1 and has a single non-zero entry.)
1) With the assumption that B* is not only low rank but also sparse, another valid sparse-plus-low-rank decomposition might be A1 = A* + e_i e_j^T and B1 = 0. Therefore, we need an appropriate notion of low rank that ensures B* is not too sparse. Conditions will be imposed later that require the spaces spanned by the singular vectors U and V (i.e., the row and column spaces of B*) to be "incoherent" with the standard basis.
2) Similarly, with the assumption that A* is sparse as well as low rank (e.g., if the first column of A* is non-zero while all other columns are 0, then A* is both rank-1 and sparse), another valid decomposition might be A2 = 0, B2 = A* + B* (here rank(B2) ≤ rank(B*) + 1). Thus we need the restriction that the sparse matrix should not be low rank, i.e., no row/column has too many non-zero entries (no dense rows/columns), to avoid such issues. A small numeric illustration of this ambiguity is sketched below.
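To make the ambiguity concrete, here is a small, purely illustrative numpy sketch: the matrix e_1 e_1^T is simultaneously sparse and rank-1, so attributing it entirely to the sparse part or entirely to the low-rank part yields two equally "valid" decompositions.

```python
import numpy as np

n = 5
C = np.zeros((n, n))
C[0, 0] = 1.0                 # C = e_1 e_1^T: sparse AND rank-1

# Decomposition 1: attribute all of C to the sparse part.
A1, B1 = C.copy(), np.zeros((n, n))
# Decomposition 2: attribute all of C to the low-rank part.
A2, B2 = np.zeros((n, n)), C.copy()

for A, B in [(A1, B1), (A2, B2)]:
    # Each pair satisfies C = A + B with A sparse and B low rank.
    print(np.count_nonzero(A), np.linalg.matrix_rank(B))
```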
5. Conditions for exact recovery / decomposition:
If A* and B* are drawn from the following classes, then we have exact recovery with high probability [1]. (A sampling sketch of both models follows below.)
1) For the low-rank matrix B* --- random orthogonal model [Candes and Recht 2008]:
A rank-k matrix B* with SVD B* = UΣV' is constructed in this way: the singular vectors U, V ∈ R^(n×k) are drawn uniformly at random from the collection of rank-k partial isometries in R^(n×k). The singular vectors in U and V need not be mutually independent. No restriction is placed on the singular values.
2) For the sparse matrix A* --- random sparsity model:
The matrix A* is such that support(A*) is chosen uniformly at random from the collection of all support sets of size m. No assumption is made about the values of A* at the locations specified by support(A*).
[support(M)]: the locations of the non-zero entries in M.
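A minimal numpy sketch of sampling from these two models (the QR-based construction of random orthonormal factors and the Gaussian values on the support are common illustrative choices, not requirements of the models):

```python
import numpy as np

rng = np.random.default_rng(0)
n, k, m = 100, 5, 400

# Random orthogonal model: U, V are random rank-k partial isometries
# (the Q factor of a Gaussian matrix has uniformly random orthonormal columns).
U = np.linalg.qr(rng.standard_normal((n, k)))[0]
V = np.linalg.qr(rng.standard_normal((n, k)))[0]
sigma = rng.uniform(1.0, 10.0, size=k)   # singular values are unrestricted
B_star = U @ np.diag(sigma) @ V.T

# Random sparsity model: a support of size m chosen uniformly at random;
# the values on the support are arbitrary (Gaussian here).
A_star = np.zeros((n, n))
idx = rng.choice(n * n, size=m, replace=False)
A_star.flat[idx] = rng.standard_normal(m)

C = A_star + B_star                      # the observed matrix to decompose
```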
More recently, [2] improved on these conditions and yields the "best" known conditions.
3. Recovery Algorithms
6. Formulation
Consider the decomposition D = A + E, in which A is low rank and the error E is sparse. (Note the notation switch in this section: here A is the low-rank part and E the sparse part, the reverse of the A*/B* roles above.)
1) An intuitive proposal:
min rank(A) + γ||E||_0,    (1)
However, (1) is non-convex and thus intractable (both rank minimization and L0-norm minimization are NP-hard to approximate).
2) Relax the L0-norm to the L1-norm and replace the rank with the nuclear norm:
min ||A||_* + λ||E||_1,    (2)
where ||A||_* = Σ_i σ_i(A) is the sum of the singular values of A.
This is convex, i.e., there exists a unique minimizer.
Reason: this relaxation is motivated by observing that ||A||_* + λ||E||_1 is the convex envelope of rank(A) + γ||E||_0 over the set of (A, E) such that max(||A||_{2,2}, ||E||_{1,∞}) ≤ 1.
Moreover, there might be circumstances under which (2) perfectly recovers the low-rank matrix A0. [3] shows this is indeed true under surprisingly broad conditions. A sketch of one standard solver for (2) follows below.
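Here is a minimal sketch of solving (2) under the constraint D = A + E via the inexact augmented Lagrange multiplier (ALM) method, one standard solver for this problem: the nuclear-norm subproblem is solved by singular value thresholding and the L1 subproblem by entrywise soft thresholding. The parameter choices (λ = 1/√max(m, n), the μ/ρ schedule, the stopping rule) follow common practice and are assumptions here, not prescriptions from the text.

```python
import numpy as np

def soft_threshold(X, tau):
    """Entrywise shrinkage: the proximal operator of the L1-norm."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def svt(X, tau):
    """Singular value thresholding: the proximal operator of the nuclear norm."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

def rpca_alm(D, lam=None, rho=1.5, tol=1e-7, max_iter=500):
    """Inexact ALM for: min ||A||_* + lam*||E||_1  s.t.  D = A + E."""
    m, n = D.shape
    lam = lam if lam is not None else 1.0 / np.sqrt(max(m, n))
    mu = 1.25 / np.linalg.norm(D, 2)            # common heuristic for mu
    A = np.zeros_like(D); E = np.zeros_like(D); Y = np.zeros_like(D)
    for _ in range(max_iter):
        A = svt(D - E + Y / mu, 1.0 / mu)             # low-rank update
        E = soft_threshold(D - A + Y / mu, lam / mu)  # sparse update
        R = D - A - E                                 # primal residual
        Y = Y + mu * R                                # multiplier update
        mu = min(mu * rho, 1e7)
        if np.linalg.norm(R) / np.linalg.norm(D) < tol:
            break
    return A, E
```

Applied to the C = A_star + B_star example sketched earlier, rpca_alm(C) should recover the two components when the incoherence and sparsity conditions above hold.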
7. RPCA Optimization Algorithms
We can approach this in two different ways. The first approach uses a first-order method to solve the primal problem directly (e.g., Proximal Gradient); a sketch of this approach follows below.
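A minimal sketch of this proximal-gradient approach, assuming the commonly used smoothed formulation min ||A||_* + λ||E||_1 + (1/(2μ))||D − A − E||_F^2 with a fixed μ and a plain (non-accelerated) iteration; both simplifications are assumptions here. The smooth term's gradient is Lipschitz with constant 2/μ, giving step size μ/2.

```python
import numpy as np

def soft_threshold(X, tau):
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def svt(X, tau):
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

def rpca_prox_grad(D, lam=None, mu=None, tol=1e-7, max_iter=1000):
    """Proximal gradient on:
       min ||A||_* + lam*||E||_1 + 1/(2*mu) * ||D - A - E||_F^2."""
    m, n = D.shape
    lam = lam if lam is not None else 1.0 / np.sqrt(max(m, n))
    mu = mu if mu is not None else 1e-3 * np.linalg.norm(D, 2)
    A = np.zeros_like(D); E = np.zeros_like(D)
    for _ in range(max_iter):
        R = D - A - E                       # negative gradient direction
        A_new = svt(A + 0.5 * R, mu / 2.0)  # gradient step + nuclear prox
        E_new = soft_threshold(E + 0.5 * R, lam * mu / 2.0)  # + L1 prox
        delta = max(np.linalg.norm(A_new - A), np.linalg.norm(E_new - E))
        A, E = A_new, E_new
        if delta < tol:
            break
    return A, E
```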