Optimization over the set of matrices satisfying $X^\top B X = I_p$, referred to as the generalized Stiefel manifold, appears in many applications involving sampled covariance matrices, such as canonical correlation analysis (CCA), independent component analysis (ICA), and the generalized eigenvalue problem (GEVP). These problems are typically solved by iterative methods, such as Riemannian approaches, which require a computationally expensive eigenvalue decomposition involving the fully formed matrix $B$. We propose a cheap stochastic iterative method that solves the optimization problem while having access only to a random estimate of the feasible set. Our method does not enforce the constraint exactly at every iteration; instead, it produces iterates that converge to a critical point on the generalized Stiefel manifold defined in expectation. The method has a lower per-iteration cost, requires only matrix multiplications, and achieves the same convergence rates as its Riemannian counterparts that involve the full matrix $B$. Experiments demonstrate its effectiveness in various machine learning applications involving generalized orthogonality constraints, including CCA, ICA, and GEVP.
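To make the constraint concrete, the following minimal NumPy sketch (not the paper's method) constructs a feasible point on the generalized Stiefel manifold by whitening an orthonormal frame with $B^{-1/2}$ and checks that $X^\top B X = I_p$ holds. The eigendecomposition used here is exactly the kind of expensive step on the fully formed $B$ that the proposed stochastic method is designed to avoid; all variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 8, 3

# A symmetric positive-definite matrix B (e.g. a sample covariance).
A = rng.standard_normal((n, n))
B = A @ A.T + n * np.eye(n)

# Classical construction of a feasible point: take an orthonormal Q
# and whiten it by B^{-1/2}, computed via an eigendecomposition of B
# (the costly operation that motivates cheaper stochastic alternatives).
w, V = np.linalg.eigh(B)
B_inv_sqrt = V @ np.diag(w ** -0.5) @ V.T
Q, _ = np.linalg.qr(rng.standard_normal((n, p)))
X = B_inv_sqrt @ Q

# Feasibility check: X lies on the generalized Stiefel manifold.
err = np.linalg.norm(X.T @ B @ X - np.eye(p))
print(err < 1e-8)
```

Since $X^\top B X = Q^\top B^{-1/2} B B^{-1/2} Q = Q^\top Q = I_p$, the residual is zero up to floating-point error.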