Orthogonality constraints naturally appear in many machine learning problems, from principal component analysis to robust neural network training. They are usually solved using Riemannian optimization algorithms, which minimize the objective function while enforcing the constraint. However, enforcing the orthogonality constraint can be the most time-consuming operation in such algorithms. Recently, Ablin & Peyré (2022) proposed the landing algorithm, a method with cheap iterations that does not enforce the orthogonality constraint but is attracted towards the manifold in a smooth manner. This article provides new practical and theoretical developments for the landing algorithm. First, the method is extended to the Stiefel manifold, the set of rectangular orthogonal matrices. We also consider stochastic and variance-reduction algorithms for the case where the cost function is an average of many functions. We demonstrate that all these methods have the same rate of convergence as their Riemannian counterparts that exactly enforce the constraint, and that they converge to the manifold. Finally, our experiments demonstrate the promise of our approach on an array of machine learning problems that involve orthogonality constraints.
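The core idea described above can be sketched in a few lines. The following is a minimal NumPy illustration, not the authors' implementation: the step size `eta`, the penalty weight `lam`, and the toy objective are illustrative choices. Each iteration combines a skew-symmetric (tangent-like) gradient term with a penalty term that smoothly attracts the iterate toward the orthogonality constraint, so no expensive retraction or projection is ever computed.

```python
import numpy as np

def landing_step(X, grad_f, eta=0.1, lam=1.0):
    """One landing-style iteration on the Stiefel manifold (illustrative sketch).

    The skew-symmetric component moves X roughly along the manifold, while the
    penalty term (the gradient of ||X^T X - I||_F^2 / 4) pulls X toward the
    constraint X^T X = I.  Both terms cost only matrix multiplications.
    """
    G = grad_f(X)
    skew = 0.5 * (G @ X.T - X @ G.T)              # skew-symmetric gradient part
    penalty = X @ (X.T @ X - np.eye(X.shape[1]))  # attraction toward the manifold
    return X - eta * (skew @ X + lam * penalty)

# Toy PCA-style objective f(X) = -0.5 * tr(X^T A X), whose gradient is -A X.
rng = np.random.default_rng(0)
A = np.diag([1.0, 0.5, 0.25, 0.1, 0.05])
grad_f = lambda X: -A @ X

# Start away from the manifold; the iterates should land on it.
X = np.eye(5, 2) + 0.3 * rng.standard_normal((5, 2))
for _ in range(1000):
    X = landing_step(X, grad_f)

orth_error = np.linalg.norm(X.T @ X - np.eye(2))  # distance to orthogonality
```

After the loop, `orth_error` is close to zero even though no iteration ever projected onto the manifold, which is the behavior the abstract attributes to the landing method.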