This paper studies the use of foundational large language models (LLMs) to make decisions during hyperparameter optimization (HPO). Empirical evaluations demonstrate that, in settings with constrained search budgets, LLMs can perform comparably to or better than traditional HPO methods such as random search and Bayesian optimization on standard benchmarks. Furthermore, we propose treating the code that specifies the model as a hyperparameter that the LLM outputs, which goes beyond the capabilities of existing HPO approaches. Our findings suggest that LLMs are a promising tool for improving efficiency in the traditional decision-making problem of hyperparameter optimization.
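To make the budget-constrained setting concrete, the following is a minimal sketch of a sequential HPO loop in which a text-based proposer chooses each new configuration from the history of past trials. It is an illustration under stated assumptions, not the paper's implementation: the prompt format, the JSON reply convention, and the names `build_prompt`, `parse_config`, `run_hpo`, and `random_search_ask` are all hypothetical. The `ask` callable stands in for an LLM call; a random-search proposer is supplied so the sketch runs without any model.

```python
import json
import random
from typing import Callable, Dict, List, Tuple

Config = Dict[str, float]
History = List[Tuple[Config, float]]
SearchSpace = Dict[str, Tuple[float, float]]


def build_prompt(search_space: SearchSpace, history: History) -> str:
    """Serialize the search space and past trials as text (hypothetical format)."""
    lines = [
        "You are tuning hyperparameters. Minimize the validation loss.",
        "Search space (name: [low, high]):",
    ]
    for name, (low, high) in search_space.items():
        lines.append(f"  {name}: [{low}, {high}]")
    lines.append("Trials so far:")
    for config, loss in history:
        lines.append(f"  {json.dumps(config)} -> loss={loss:.4f}")
    lines.append("Reply with the next configuration as a JSON object.")
    return "\n".join(lines)


def parse_config(reply: str, search_space: SearchSpace) -> Config:
    """Parse a JSON reply and clip each value to its allowed range."""
    raw = json.loads(reply)
    return {
        name: min(max(float(raw[name]), low), high)
        for name, (low, high) in search_space.items()
    }


def run_hpo(
    objective: Callable[[Config], float],
    ask: Callable[[str], str],  # assumption: wraps an LLM call, text in, text out
    search_space: SearchSpace,
    budget: int,
) -> Tuple[Config, float]:
    """Run `budget` trials and return the best (config, loss) pair."""
    history: History = []
    for _ in range(budget):
        prompt = build_prompt(search_space, history)
        config = parse_config(ask(prompt), search_space)
        history.append((config, objective(config)))
    return min(history, key=lambda trial: trial[1])


def random_search_ask(search_space: SearchSpace, seed: int = 0) -> Callable[[str], str]:
    """Baseline proposer: ignores the prompt and samples uniformly at random."""
    rng = random.Random(seed)

    def ask(prompt: str) -> str:
        sample = {n: rng.uniform(lo, hi) for n, (lo, hi) in search_space.items()}
        return json.dumps(sample)

    return ask


# Toy usage: a quadratic stands in for a real validation loss.
space = {"log10_lr": (-5.0, -1.0), "dropout": (0.0, 0.5)}


def toy_loss(config: Config) -> float:
    return (config["log10_lr"] + 3.0) ** 2 + (config["dropout"] - 0.1) ** 2


best_config, best_loss = run_hpo(toy_loss, random_search_ask(space), space, budget=10)
```

Because the proposer is only a function from prompt text to reply text, swapping the random baseline for an actual LLM client changes one argument and leaves the loop and the budget accounting untouched, which keeps the comparison between proposers on equal footing.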