Popular machine learning approaches forgo second-order information due to the difficulty of computing curvature in high dimensions. We present FOSI, a novel meta-algorithm that improves the performance of any base first-order optimizer by efficiently incorporating second-order information during the optimization process. In each iteration, FOSI implicitly splits the function into two quadratic functions defined on orthogonal subspaces, then uses a second-order method to minimize the first, and the base optimizer to minimize the other. We formally analyze FOSI's convergence and the conditions under which it improves a base optimizer. Our empirical evaluation demonstrates that FOSI improves the convergence rate and optimization time of first-order methods such as Heavy-Ball and Adam, and outperforms second-order methods (K-FAC and L-BFGS).
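The per-iteration split described above can be illustrated on a toy quadratic. The following is a minimal sketch of the general idea only, not FOSI's implementation: the abstract does not say how the subspaces are chosen, so the sketch assumes the second-order subspace is spanned by the `k` eigenvectors of the Hessian with the largest eigenvalues, obtained from a full eigendecomposition, and it uses plain gradient descent as the base optimizer. The names `hybrid_step` and `demo`, the test problem, and the learning rates are all hypothetical.

```python
import numpy as np


def hybrid_step(x, grad, V, lams, lr):
    """One illustrative step: Newton on span(V), gradient descent on its complement."""
    g = grad(x)
    coeffs = V.T @ g               # gradient coordinates inside span(V)
    g_rest = g - V @ coeffs        # gradient component orthogonal to span(V)
    newton = V @ (coeffs / lams)   # Newton step restricted to span(V)
    return x - newton - lr * g_rest


def demo(k=2, steps=50, seed=0):
    """Compare plain gradient descent with the hybrid step on an ill-conditioned quadratic."""
    rng = np.random.default_rng(seed)
    eigs = np.array([100.0, 50.0, 1.0, 0.5, 0.1])      # assumed toy spectrum
    Q, _ = np.linalg.qr(rng.standard_normal((5, 5)))   # random orthogonal basis
    H = Q @ np.diag(eigs) @ Q.T
    H = (H + H.T) / 2                                  # remove round-off asymmetry
    b = rng.standard_normal(5)
    x_star = np.linalg.solve(H, b)                     # exact minimizer of 0.5 x'Hx - b'x

    def grad(x):
        return H @ x - b

    vals, vecs = np.linalg.eigh(H)                     # eigenvalues in ascending order
    V, lams = vecs[:, -k:], vals[-k:]                  # assumption: top-k curvature directions

    lr_plain = 1.0 / vals[-1]        # step size limited by the largest eigenvalue
    lr_hybrid = 1.0 / vals[-k - 1]   # limited only by the largest remaining eigenvalue
    x_plain = np.zeros(5)
    x_hybrid = np.zeros(5)
    for _ in range(steps):
        x_plain = x_plain - lr_plain * grad(x_plain)
        x_hybrid = hybrid_step(x_hybrid, grad, V, lams, lr_hybrid)

    return (float(np.linalg.norm(x_plain - x_star)),
            float(np.linalg.norm(x_hybrid - x_star)))
```

Calling `demo()` returns the distance to the minimizer for plain gradient descent and for the hybrid step after the same number of iterations. In this toy setting the Newton step removes the high-curvature directions, so the base optimizer can use a step size governed by the remaining, better-conditioned subspace; this is one plausible reading of how such a split could help, under the assumptions stated above.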