The increasing reliance on numerical methods for controlling dynamical systems and training machine learning models underscores the need for algorithms that navigate complex optimization landscapes reliably and efficiently. Classical gradient descent methods offer strong theoretical guarantees for convex problems; however, they demand meticulous hyperparameter tuning for non-convex ones. The emerging paradigm of learning to optimize (L2O) automates the discovery of high-performing algorithms by leveraging learning models and data, yet it lacks a theoretical framework for analyzing the convergence of the learned algorithms. In this paper, we fill this gap by harnessing nonlinear system theory. Specifically, we propose an unconstrained parametrization of all convergent algorithms for smooth non-convex objective functions. Notably, our framework is directly compatible with automatic differentiation tools, ensuring convergence by design while learning to optimize.
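To make the "convergence by design" idea concrete, the following sketch (not the paper's actual parametrization) illustrates the principle in its simplest form: an optimizer whose tunable parameter is unconstrained, but whose update is mapped through a bounded function so that every parameter value yields a step size known to be convergent for L-smooth functions. The names `step_size` and `run`, and the sigmoid-based mapping, are illustrative assumptions.

```python
import numpy as np

def step_size(theta, L):
    """Map an unconstrained parameter theta to a step size in (0, 2/L).

    For an L-smooth objective, gradient steps in (0, 2/L) are convergent,
    so any real theta produces a valid step by construction -- a toy analogue
    of an unconstrained parametrization of convergent algorithms.
    """
    return (2.0 / L) / (1.0 + np.exp(-theta))  # sigmoid scaled into (0, 2/L)

def run(theta, x0, grad, L, iters=100):
    """Run gradient descent with the parametrized step size."""
    x = x0
    for _ in range(iters):
        x = x - step_size(theta, L) * grad(x)
    return x

# Example: f(x) = 0.5 * x^2 has gradient x and smoothness constant L = 1.
x_final = run(theta=0.0, x0=np.array([5.0]), grad=lambda x: x, L=1.0)
```

Because the mapping is smooth in `theta`, the whole pipeline remains differentiable, which is what makes such constructions compatible with automatic differentiation when `theta` is learned from data.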