This paper addresses the challenge of optimizing meta-parameters (i.e., hyperparameters) in machine learning algorithms, a critical factor influencing training efficiency and model performance. Moving away from computationally expensive traditional meta-parameter search methods, we introduce the MetaOptimize framework, which dynamically adjusts meta-parameters, particularly step sizes (also known as learning rates), during training. More specifically, MetaOptimize can wrap around any first-order optimization algorithm, tuning step sizes on the fly to minimize a specific form of regret that accounts for the long-term effect of step sizes on training through a discounted sum of future losses. We also introduce low-complexity variants of MetaOptimize that, combined with its adaptability to multiple optimization algorithms, achieve performance competitive with the best hand-crafted learning-rate schedules across a range of machine learning applications.
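To make the discounted objective concrete, here is one plausible formalization of the regret notion described above; this is a sketch in notation of our own choosing (the symbols $f_\tau$, $w_\tau$, $\alpha_t$, $\gamma$, and $\eta$ are assumptions, not necessarily the paper's):

\[
J_t \;=\; \sum_{\tau = t}^{\infty} \gamma^{\,\tau - t}\, f_\tau(w_\tau),
\qquad
\alpha_{t+1} \;=\; \alpha_t \,-\, \eta\, \frac{\partial J_t}{\partial \alpha_t},
\]

where $f_\tau$ is the loss at step $\tau$, $w_\tau$ are the weights produced by the wrapped first-order optimizer, $\alpha_t$ is the step size, $\gamma \in [0,1)$ discounts future losses, and $\eta$ is a meta step size. Adjusting $\alpha_t$ to reduce $J_t$ rather than only the immediate loss $f_t$ is what captures the long-term effect of step sizes on training.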