Hyperparameter optimization is an important subfield of machine learning that focuses on tuning the hyperparameters of a chosen algorithm to achieve peak performance. Recently, a stream of methods has tackled hyperparameter optimization; however, most of them do not exploit the dominant power-law nature of learning curves for Bayesian optimization. In this work, we propose Deep Power Laws (DPL), an ensemble of neural network models conditioned to yield predictions that follow a power-law scaling pattern. Our method dynamically decides which configurations to pause and which to train incrementally by making use of gray-box evaluations. We compare our method against 7 state-of-the-art competitors on 3 benchmarks related to tabular, image, and NLP datasets, covering 59 diverse tasks. Our method achieves the best any-time results across all benchmarks compared to all competitors.
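To make the power-law idea concrete, the following is a minimal sketch of extrapolating a partial learning curve. It is not the DPL method itself: the abstract does not state the functional form, so the three-parameter curve `alpha + beta * budget**(-gamma)` is an assumption, and the neural network ensemble is replaced here by a plain least-squares fit. All names (`power_law`, `fit_power_law`, the synthetic data) are hypothetical and chosen for illustration only.

```python
import numpy as np


def power_law(budget, alpha, beta, gamma):
    """Assumed power-law curve: value = alpha + beta * budget**(-gamma)."""
    return alpha + beta * np.power(budget, -gamma)


def fit_power_law(budgets, values, gammas=np.linspace(0.05, 3.0, 60)):
    """Fit (alpha, beta, gamma) to an observed partial curve.

    Grid search over gamma; for each candidate gamma, alpha and beta
    are obtained by ordinary linear least squares.
    """
    budgets = np.asarray(budgets, dtype=float)
    values = np.asarray(values, dtype=float)
    best = None
    for gamma in gammas:
        # For a fixed gamma the model is linear in alpha and beta.
        design = np.column_stack([np.ones_like(budgets), budgets ** (-gamma)])
        coef, *_ = np.linalg.lstsq(design, values, rcond=None)
        err = float(np.sum((design @ coef - values) ** 2))
        if best is None or err < best[0]:
            best = (err, coef[0], coef[1], gamma)
    _, alpha, beta, gamma = best
    return alpha, beta, gamma


# Synthetic partial learning curve: validation loss after epochs 1..10.
observed_epochs = np.arange(1, 11)
observed_loss = power_law(observed_epochs, 0.2, 0.8, 0.5)

# Fit on the observed prefix, then extrapolate to a larger budget.
alpha, beta, gamma = fit_power_law(observed_epochs, observed_loss)
predicted_final = power_law(50, alpha, beta, gamma)
print(f"predicted loss at epoch 50: {predicted_final:.4f}")
```

In a gray-box setting, an extrapolation of this kind could be used to rank partially trained configurations and decide which one deserves further training budget; how DPL actually makes that decision is not specified in the abstract.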