Orthogonality constraints arise naturally in many machine learning problems, from Principal Component Analysis to robust neural network training. Such problems are usually solved with Riemannian optimization algorithms, which minimize the objective function while enforcing the constraint at every iteration. However, enforcing the orthogonality constraint can be the most time-consuming operation in these algorithms. Recently, Ablin & Peyré (2022) proposed the landing algorithm, a method with cheap iterations that does not enforce the orthogonality constraint but is instead smoothly attracted towards the manifold. In this article, we provide new practical and theoretical developments for the landing algorithm. First, we extend the method to the Stiefel manifold, the set of rectangular orthogonal matrices. We also consider stochastic and variance-reduction algorithms for the case where the cost function is an average of many functions. We demonstrate that all these methods have the same rate of convergence as their Riemannian counterparts that exactly enforce the constraint. Finally, our experiments demonstrate the promise of our approach on an array of machine learning problems that involve orthogonality constraints.
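To make the idea of an infeasible, retraction-free iteration concrete, the following minimal sketch applies a landing-style update to a small PCA problem on the Stiefel manifold. The specific form of the update is an assumption not stated in the text above: the direction is taken to be skew(G Xᵀ) X + λ X (Xᵀ X − I), where G is the Euclidean gradient, the first term moves along the objective, and the second pulls the iterate back towards orthogonality. The function names (`landing_field`, `landing_pca`), the step size, the value of λ, and the test matrix are all illustrative choices.

```python
import numpy as np


def landing_field(X, G, lam):
    """Assumed landing direction for an n x p iterate X with Euclidean gradient G."""
    p = X.shape[1]
    GXt = G @ X.T
    psi = 0.5 * (GXt - GXt.T)  # skew-symmetric part of G X^T
    # First term: descent component; second term: attraction to {X : X^T X = I}.
    return psi @ X + lam * X @ (X.T @ X - np.eye(p))


def landing_pca(A, p, step=0.1, lam=1.0, n_iters=3000, seed=0):
    """Minimize f(X) = -0.5 * trace(X^T A X) without any retraction or projection."""
    rng = np.random.default_rng(seed)
    # Start from a random matrix with orthonormal columns.
    X, _ = np.linalg.qr(rng.standard_normal((A.shape[0], p)))
    for _ in range(n_iters):
        G = -A @ X  # Euclidean gradient of f for symmetric A
        X = X - step * landing_field(X, G, lam)
    return X


# Illustrative symmetric matrix with known eigenvalues (top three: 1.0, 0.9, 0.8).
rng = np.random.default_rng(1)
n, p = 10, 3
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
eigvals = np.concatenate([[1.0, 0.9, 0.8], np.linspace(0.4, 0.0, n - p)])
A = (Q * eigvals) @ Q.T

X = landing_pca(A, p)
orth_error = np.linalg.norm(X.T @ X - np.eye(p))  # distance to orthogonality
captured = np.trace(X.T @ A @ X)  # should approach the sum of the top-p eigenvalues
print(orth_error, captured)
```

Each iteration uses only matrix products, with no QR factorization, SVD, or matrix exponential inside the loop, which is the source of the cheap iterations mentioned above. For clarity the sketch forms the n × n matrix skew(G Xᵀ) explicitly; an implementation concerned with cost for large n would likely avoid this by computing G (Xᵀ X) and X (Gᵀ X) directly.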