Robust PCA Study Notes (Part 2)

For first-order methods (e.g., Proximal Gradient and Accelerated Proximal Gradient (APG)), the computational bottleneck of each iteration is an SVD computation. The second approach is to formulate and solve the dual problem, and then retrieve the primal solution from the dual optimal solution. The dual problem to RPCA can be written as:
$$\max_Y \ \operatorname{trace}(D^T Y), \quad \text{subject to } J(Y) \le 1$$
where $J(Y) = \max(\|Y\|_2, \lambda^{-1}\|Y\|_\infty)$, and $\|A\|_x$ denotes the $x$ norm of $A$ (the infinity norm of a matrix is the largest absolute value among its entries). This dual problem can be solved by constrained steepest ascent.
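As a concrete illustration of the constraint, here is a minimal numpy sketch that evaluates $J(Y)$; the function name `J` is my own choice for these notes, not from the papers.

```python
import numpy as np

def J(Y, lam):
    """Evaluate J(Y) = max(||Y||_2, lam^{-1} * ||Y||_inf).

    ||Y||_2 is the spectral norm (largest singular value); ||Y||_inf is
    the largest absolute entry of the matrix.
    """
    spectral = np.linalg.norm(Y, 2)   # largest singular value
    linf = np.abs(Y).max()            # largest absolute entry
    return max(spectral, linf / lam)
```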
Now let's talk about the Augmented Lagrange Multiplier (ALM) method and the Alternating Directions Method (ADM) [2, 4].
7.1 General method of ALM
For the optimization problem
$$\min f(X), \quad \text{subject to } h(X) = 0 \qquad (3)$$
we can define the augmented Lagrange function:
$$L(X, Y, \mu) = f(X) + \langle Y, h(X) \rangle + \frac{\mu}{2}\|h(X)\|_F^2 \qquad (4)$$
where $Y$ is a Lagrange multiplier and $\mu$ is a positive scalar.
The general method of ALM is as follows: a generic Lagrange multiplier algorithm would solve PCP (principal component pursuit) by repeatedly setting $X_{k+1} = \arg\min_X L(X, Y_k, \mu)$, and then updating the Lagrange multiplier matrix via $Y_{k+1} = Y_k + \mu\, h(X_{k+1})$. A runnable toy sketch of this loop is given below.
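Before specializing to RPCA, a toy sketch of this generic loop may help. Everything here (the helper `alm`, the toy problem, the penalty schedule `rho`) is an illustrative assumption, not part of the referenced algorithms:

```python
import numpy as np
from scipy.optimize import minimize

def alm(f, h, x0, mu=1.0, rho=1.5, iters=30):
    """Generic ALM sketch for min f(x) subject to h(x) = 0 (vector-valued h).

    Repeats: x_{k+1} = argmin_x L(x, y_k, mu), then y_{k+1} = y_k + mu * h(x_{k+1}).
    """
    x, y = np.asarray(x0, dtype=float), np.zeros_like(h(x0))
    for _ in range(iters):
        L = lambda z: f(z) + y @ h(z) + 0.5 * mu * h(z) @ h(z)  # Lagrangian (4)
        x = minimize(L, x).x   # inner subproblem, via a black-box solver
        y = y + mu * h(x)      # multiplier update
        mu *= rho              # optionally increase the penalty (inexact ALM style)
    return x

# Toy check: min ||x||^2 subject to sum(x) = 1, whose solution is x_i = 1/n.
x = alm(lambda z: z @ z, lambda z: np.array([z.sum() - 1.0]), np.zeros(3))
print(x)   # ~ [1/3, 1/3, 1/3]
```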
7.2 ALM algorithm for RPCA
In RPCA, we can define
$$X = (A, E), \quad f(X) = \|A\|_* + \lambda\|E\|_1, \quad h(X) = D - A - E \qquad (5)$$
Then the Lagrange function is
$$L(A, E, Y, \mu) = \|A\|_* + \lambda\|E\|_1 + \langle Y, D - A - E \rangle + \frac{\mu}{2}\|D - A - E\|_F^2 \qquad (6)$$
The optimization flow is just like the general ALM method. The initialization $Y = Y_0^*$ is inspired by the dual problem, as it is likely to make the objective function value reasonably large.
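To make the notation in (6) concrete, here is a small numpy helper that evaluates the augmented Lagrangian; the name `rpca_lagrangian` is mine, and it is only meant as a sanity check that the objective behaves as expected across iterations:

```python
import numpy as np

def rpca_lagrangian(A, E, Y, D, lam, mu):
    """Value of the augmented Lagrangian (6) at (A, E) for multiplier Y."""
    nuclear = np.linalg.svd(A, compute_uv=False).sum()  # ||A||_*
    l1 = np.abs(E).sum()                                # ||E||_1
    R = D - A - E                                       # residual h(X)
    return nuclear + lam * l1 + np.sum(Y * R) + 0.5 * mu * np.sum(R * R)
```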
Theorem 1. For Algorithm 4, any accumulation point $(A^*, E^*)$ of $(A_k^*, E_k^*)$ is an optimal solution to the RPCA problem, and the convergence rate is at least $O(\mu_k^{-1})$ [5].
In this RPCA algorithm, an alternating iteration strategy is adopted. Since the optimization process might otherwise be confusing, we first state two well-known facts, (7) and (8):
$$S_\varepsilon[W] = \arg\min_X \ \varepsilon\|X\|_1 + \tfrac{1}{2}\|X - W\|_F^2 \qquad (7)$$
$$U S_\varepsilon[\Sigma] V^T = \arg\min_X \ \varepsilon\|X\|_* + \tfrac{1}{2}\|X - W\|_F^2, \quad \text{where } U \Sigma V^T \text{ is the SVD of } W \qquad (8)$$
These facts are used in the above algorithm to optimize one variable while the other is fixed. In the solutions, $S_\varepsilon[\cdot]$ is the soft-thresholding (shrinkage) operator, $S_\varepsilon[w] = \operatorname{sgn}(w)\max(|w| - \varepsilon, 0)$, applied element-wise (and to the singular values in (8)). Below, I first sketch the two operators in code and then work through how they yield the update steps.
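Here is a minimal numpy sketch of the two operators, assuming the standard closed forms stated above (the names `soft_threshold` and `svd_threshold` are mine):

```python
import numpy as np

def soft_threshold(W, eps):
    """Elementwise shrinkage S_eps[W]: the minimizer in (7)."""
    return np.sign(W) * np.maximum(np.abs(W) - eps, 0.0)

def svd_threshold(W, eps):
    """Singular value thresholding: the minimizer in (8).

    Shrinks the singular values of W by eps and reassembles the matrix.
    """
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    return U @ np.diag(soft_threshold(s, eps)) @ Vt
```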
Now we apply facts (7) and (8) to the RPCA problem.
To obtain the optimal $E$ from the objective function (6), we can drop the term $\|A\|_*$, which does not depend on $E$, and minimize:

$$
\begin{aligned}
f(E) &= \lambda\|E\|_1 + \langle Y,\, D-A-E \rangle + \frac{\mu}{2}\|D-A-E\|_F^2 \\
&= \lambda\|E\|_1 + \frac{\mu}{2}\Big(2\langle \mu^{-1}Y,\, D-A-E \rangle + \|D-A-E\|_F^2 + \|\mu^{-1}Y\|_F^2\Big) + \mathrm{const} \\
&= \lambda\|E\|_1 + \frac{\mu}{2}\big\|D-A-E+\mu^{-1}Y\big\|_F^2 + \mathrm{const}
\end{aligned}
$$

where in the second line we added the constant $\frac{\mu}{2}\|\mu^{-1}Y\|_F^2$, which is irrelevant to $E$, and in the third line we completed the square. Dividing by $\mu$ (which does not change the minimizer), the $E$-subproblem becomes

$$\min_E\ \frac{\lambda}{\mu}\|E\|_1 + \frac{1}{2}\big\|E - (D-A+\mu^{-1}Y)\big\|_F^2,$$

which is exactly the form of (7). Hence the optimization step for $E$ is

$$E = S_{\lambda/\mu}\big[D - A + \mu^{-1}Y\big],$$

the same update as in Algorithm 4. (Note the sign: completing the square with $+\langle Y, D-A-E \rangle$ yields $+\mu^{-1}Y$ inside the operator.)
Similarly, using fact (8) for the nuclear-norm term, one can show that $A = U S_{1/\mu}[\Sigma] V^T$, where $U \Sigma V^T$ is the SVD of $D - E + \mu^{-1}Y$, is the optimal update for $A$. A sketch of the full iteration, combining both updates, follows.
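Putting the two closed-form updates together, here is a sketch of the resulting inexact-ALM iteration. The defaults for `lam`, the initial `mu`, and the growth factor `rho` follow common heuristics from the literature (e.g., [5]) and are assumptions, not prescriptions:

```python
import numpy as np

def soft_threshold(W, eps):
    # Elementwise shrinkage operator S_eps, as defined in (7) above.
    return np.sign(W) * np.maximum(np.abs(W) - eps, 0.0)

def rpca_alm(D, lam=None, rho=1.5, tol=1e-7, max_iter=500):
    """Inexact-ALM sketch for RPCA: min ||A||_* + lam*||E||_1 s.t. D = A + E."""
    m, n = D.shape
    lam = lam if lam is not None else 1.0 / np.sqrt(max(m, n))
    norm2 = np.linalg.norm(D, 2)
    Y = D / max(norm2, np.abs(D).max() / lam)  # dual-inspired Y_0 = D / J(D)
    mu = 1.25 / norm2                          # heuristic initial penalty
    A, E = np.zeros_like(D), np.zeros_like(D)
    for _ in range(max_iter):
        # A-step: fact (8), singular value thresholding of D - E + Y/mu.
        U, s, Vt = np.linalg.svd(D - E + Y / mu, full_matrices=False)
        A = U @ np.diag(soft_threshold(s, 1.0 / mu)) @ Vt
        # E-step: fact (7), elementwise shrinkage of D - A + Y/mu.
        E = soft_threshold(D - A + Y / mu, lam / mu)
        # Multiplier and penalty updates.
        Y = Y + mu * (D - A - E)
        mu *= rho
        if np.linalg.norm(D - A - E) <= tol * np.linalg.norm(D):
            break
    return A, E
```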
8. References
1) E. J. Candes and B. Recht. Exact matrix completion via convex optimization. Submitted for publication, 2008.
2) E. J. Candes, X. Li, Y. Ma, and J. Wright. Robust principal component analysis. Submitted for publication, 2009.
3) J. Wright, A. Ganesh, S. Rao, Y. Peng, and Y. Ma. Robust principal component analysis: Exact recovery of corrupted low-rank matrices via convex optimization. In NIPS, 2009.
4) X. Yuan and J. Yang. Sparse and low-rank matrix decomposition via alternating direction methods. Preprint, 2009.
5) Z. Lin, M. Chen, L. Wu, and Y. Ma. The augmented Lagrange multiplier method for exact recovery of corrupted low-rank matrices. Preprint, 2009.