This paper explores the use of foundation large language models (LLMs) in hyperparameter optimization (HPO). Hyperparameters are critical in determining the effectiveness of machine learning models, yet their optimization often relies on manual approaches in limited-budget settings. By prompting LLMs with dataset and model descriptions, we develop a methodology in which LLMs suggest hyperparameter configurations that are iteratively refined based on model performance. Our empirical evaluations on standard benchmarks reveal that, within constrained search budgets, LLMs can match or outperform traditional HPO methods such as Bayesian optimization across different models. Furthermore, we propose treating the code that specifies the model as a hyperparameter, which the LLM outputs, affording greater flexibility than existing HPO approaches.
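The iterative loop described above, in which the LLM proposes a configuration, the configuration is evaluated, and the resulting score is fed back into the next prompt, can be sketched as follows. This is a minimal illustration, not the paper's implementation: `llm_suggest` is a hypothetical stand-in for an actual LLM call, and `evaluate` substitutes a synthetic objective for real model training.

```python
import math
import random


def llm_suggest(history):
    """Stand-in for an LLM call. A real system would prompt the model with
    dataset/model descriptions plus the (config, score) history and parse
    its suggested configuration; here we sample one deterministically."""
    random.seed(len(history))  # deterministic stub for illustration
    return {"lr": 10 ** random.uniform(-4, -1), "depth": random.randint(2, 8)}


def evaluate(config):
    """Stand-in for training and validating a model. This synthetic objective
    peaks at lr = 1e-2 and depth = 5 (higher is better)."""
    return -((math.log10(config["lr"]) + 2) ** 2
             + (config["depth"] - 5) ** 2 / 10)


def llm_hpo(budget=8):
    """Iterative refinement: propose, evaluate, append to history, repeat.
    Returns the best (config, score) pair found within the budget."""
    history = []
    for _ in range(budget):
        cfg = llm_suggest(history)
        history.append((cfg, evaluate(cfg)))
    return max(history, key=lambda pair: pair[1])


best_cfg, best_score = llm_hpo()
```

The feedback loop is what distinguishes this setup from one-shot prompting: each round's prompt carries the full evaluation history, letting the LLM condition its next suggestion on observed performance, much as a surrogate model does in Bayesian optimization.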