We apply PAC-Bayes theory to the setting of learning-to-optimize. To the best of our knowledge, we present the first framework for learning optimization algorithms with provable generalization guarantees (PAC bounds) and an explicit trade-off between a high probability of convergence and a high convergence speed. Even in the limit case, where convergence is guaranteed, our learned optimization algorithms provably outperform related algorithms based on a (deterministic) worst-case analysis. Our results rely on PAC-Bayes bounds for general, unbounded loss functions based on exponential families. By generalizing existing ideas, we reformulate the learning procedure as a one-dimensional minimization problem and study whether a global minimum can be found, which enables the algorithmic realization of the learning procedure. As a proof of concept, we learn hyperparameters of standard optimization algorithms to empirically underline our theory.
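To illustrate how a PAC-Bayes learning procedure can collapse to a one-dimensional problem, the block below sketches a textbook Catoni-type bound for losses bounded in $[0,1]$. It is not the bound derived in this work, which covers unbounded losses via exponential families. All symbols are assumptions introduced for illustration: $\pi$ is a data-independent prior over hyperparameters $\theta$, $\rho$ a posterior, $R$ the expected loss, $\hat{r}_n$ the empirical loss over $n$ i.i.d. problem instances, $\lambda > 0$ a fixed parameter, and $\delta \in (0,1)$ the confidence level.

```latex
% Textbook PAC-Bayes bound (bounded loss in [0,1], fixed \lambda > 0).
% With probability at least 1 - \delta over the sample, for all \rho \ll \pi:
\begin{equation*}
  \mathbb{E}_{\theta \sim \rho}\bigl[R(\theta)\bigr]
  \;\le\;
  \mathbb{E}_{\theta \sim \rho}\bigl[\hat{r}_n(\theta)\bigr]
  + \frac{\lambda}{8n}
  + \frac{\mathrm{KL}(\rho \,\|\, \pi) + \log(1/\delta)}{\lambda}.
\end{equation*}

% For fixed \lambda, the right-hand side is minimized by the Gibbs posterior:
\begin{equation*}
  \frac{d\rho_\lambda}{d\pi}(\theta)
  = \frac{\exp\bigl(-\lambda\, \hat{r}_n(\theta)\bigr)}
         {\mathbb{E}_{\theta' \sim \pi}\bigl[\exp\bigl(-\lambda\, \hat{r}_n(\theta')\bigr)\bigr]}.
\end{equation*}

% Substituting \rho_\lambda leaves a function of the scalar \lambda only:
\begin{equation*}
  B(\lambda)
  = -\frac{1}{\lambda}
    \log \mathbb{E}_{\theta \sim \pi}\bigl[\exp\bigl(-\lambda\, \hat{r}_n(\theta)\bigr)\bigr]
  + \frac{\lambda}{8n}
  + \frac{\log(1/\delta)}{\lambda}.
\end{equation*}
```

In this simplified setting, learning amounts to choosing the scalar $\lambda$ that minimizes $B(\lambda)$. Because the bound holds for a fixed $\lambda$, selecting $\lambda$ from the data typically requires an additional step, such as a union bound over a finite grid of candidate values.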