Optimization over the set of matrices $X$ that satisfy $X^\top B X = I_p$, referred to as the generalized Stiefel manifold, appears in many applications involving sampled covariance matrices, such as canonical correlation analysis (CCA), independent component analysis (ICA), and the generalized eigenvalue problem (GEVP). These problems are typically solved by iterative methods that require a fully formed matrix $B$. We propose an inexpensive stochastic iterative method that solves the optimization problem while having access only to random estimates of $B$. Our method does not enforce the constraint at every iteration; instead, it produces iterates that converge to critical points on the generalized Stiefel manifold defined in expectation. The method has a lower per-iteration cost, requires only matrix multiplications, and has the same convergence rates as its Riemannian optimization counterparts that require the full matrix $B$. Experiments demonstrate its effectiveness in various machine learning applications involving generalized orthogonality constraints, including CCA, ICA, and the GEVP.
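To make the constraint and the flavor of such a stochastic update concrete, here is a minimal NumPy sketch, not the paper's actual algorithm: it drives iterates toward $X^\top B X = I_p$ by stochastic gradient descent on the penalty $h(X) = \tfrac{1}{4}\|X^\top B X - I_p\|_F^2$, using only mini-batch estimates of the covariance $B$ inside the loop. All dimensions, step sizes, and batch sizes are illustrative choices; two independent batches are used per step so the stochastic penalty gradient remains unbiased.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 20, 3

# Samples z_i whose covariance defines B = E[z z^T]; the full B is
# formed here only to verify the result, never inside the loop.
Z = rng.standard_normal((5000, n))
B = Z.T @ Z / Z.shape[0]

X = 0.1 * rng.standard_normal((n, p))
I_p = np.eye(p)

for k in range(3000):
    eta = 0.3 / (1.0 + 0.02 * k)  # decaying step size (illustrative)
    # Two independent mini-batch estimates of B, so that the stochastic
    # gradient of h(X) = (1/4) ||X^T B X - I||_F^2 stays unbiased.
    b1 = Z[rng.integers(0, Z.shape[0], size=100)]
    b2 = Z[rng.integers(0, Z.shape[0], size=100)]
    B1 = b1.T @ b1 / 100.0
    B2 = b2.T @ b2 / 100.0
    # Penalty step attracting iterates toward X^T B X = I_p; the
    # constraint is never enforced at any single iteration, only in
    # the limit, and each step uses matrix multiplications alone.
    X = X - eta * (B1 @ X @ (X.T @ B2 @ X - I_p))

residual = np.linalg.norm(X.T @ B @ X - I_p)
print(f"constraint residual ||X^T B X - I||_F = {residual:.3f}")
```

In the method described above, the update would also contain a gradient term for the objective (e.g. the CCA or GEVP loss); this sketch isolates only the constraint-attraction component to show how iterates can approach the generalized Stiefel manifold without ever forming $B$ in the loop.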