The increasing reliance on numerical methods for controlling dynamical systems and training machine learning models underscores the need for algorithms that dependably and efficiently navigate complex optimization landscapes. Classical gradient descent methods offer strong theoretical guarantees for convex problems; however, they demand meticulous hyperparameter tuning for non-convex ones. The emerging paradigm of learning to optimize (L2O) automates the discovery of algorithms with optimized performance by leveraging learning models and data, yet it lacks a theoretical framework for analyzing the convergence and robustness of the learned algorithms. In this paper, we fill this gap by harnessing nonlinear system theory. Specifically, we propose an unconstrained parametrization of all convergent algorithms for smooth non-convex objective functions. Notably, our framework is directly compatible with automatic differentiation tools, ensuring convergence by design while learning to optimize.