Hyperparameter optimization is an important subfield of machine learning that focuses on tuning the hyperparameters of a chosen algorithm to achieve peak performance. A stream of recent methods tackles hyperparameter optimization; however, most of them do not exploit the scaling-law property of learning curves. In this work, we propose Deep Power Laws (DPL), an ensemble of neural network models conditioned to yield predictions that follow a power-law scaling pattern. Using gray-box evaluations, our method dynamically decides which configurations to pause and which to train incrementally. We compare our method against 7 state-of-the-art competitors on 3 benchmarks of tabular, image, and NLP datasets, covering 57 diverse tasks. Our method obtains the best any-time results among all competitors on all benchmarks.
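To make the idea of power-law-constrained learning-curve predictions concrete, the following is a minimal sketch under stated assumptions, not the DPL implementation. It assumes a curve of the form alpha + beta * budget^(-gamma), which the abstract does not specify. It replaces the neural networks with hard-coded, hypothetical coefficient triples per ensemble member, and it swaps in a simple greedy rule (lowest mean predicted final loss) for whatever selection criterion the method actually uses. All function names and numbers are illustrative.

```python
from statistics import mean, pstdev


def power_law(budget, alpha, beta, gamma):
    """Assumed curve form: predicted loss = alpha + beta * budget**(-gamma)."""
    return alpha + beta * budget ** (-gamma)


def ensemble_predict(members, budget):
    """Mean and spread of the members' power-law predictions at `budget`."""
    preds = [power_law(budget, *coeffs) for coeffs in members]
    return mean(preds), pstdev(preds)


def pick_config_to_advance(ensembles, max_budget):
    """Greedy stand-in rule: choose the configuration whose mean predicted
    loss at `max_budget` is lowest; the others stay paused."""
    return min(
        ensembles,
        key=lambda name: ensemble_predict(ensembles[name], max_budget)[0],
    )


# Hypothetical (alpha, beta, gamma) triples, one per ensemble member.
# In a real system these would come from trained networks, not constants.
ensembles = {
    "config_a": [(0.20, 0.8, 0.5), (0.22, 0.7, 0.6)],     # starts low, plateaus high
    "config_b": [(0.05, 1.5, 0.7), (0.06, 1.4, 0.75)],    # starts high, plateaus low
}

chosen = pick_config_to_advance(ensembles, max_budget=100)
print(chosen)
```

In this toy setup, config_b looks worse after one step but is predicted to end lower at the full budget, so extrapolating the curve changes which configuration gets the next increment of training. The spread returned by `ensemble_predict` is the kind of uncertainty signal an acquisition function could use in place of the greedy rule.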