Hyperparameter optimization is critical in modern machine learning, yet it demands expert knowledge, numerous trials, and substantial computational and human resources. Despite advances in Automated Machine Learning (AutoML), challenges in trial efficiency, setup complexity, and interoperability persist. To address these issues, we introduce AgentHPO (short for LLM Agent-based Hyperparameter Optimization), a novel paradigm that leverages Large Language Models (LLMs) to automate hyperparameter optimization across diverse machine learning tasks. Specifically, AgentHPO processes the task information autonomously, conducts experiments with specific hyperparameters (HPs), and iteratively optimizes them based on historical trials. Compared to traditional AutoML methods, this human-like optimization process substantially reduces the number of required trials, simplifies setup, and enhances interpretability and user trust. Extensive experiments on 12 representative machine learning tasks indicate that AgentHPO not only matches but often surpasses the best human trials in performance while providing explainable results. Further analysis sheds light on the strategies the LLM employs in optimizing these tasks, highlighting its effectiveness and adaptability across scenarios.
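The propose, run, and refine loop described in the abstract can be illustrated with a minimal sketch. This is not the AgentHPO implementation: the LLM agent is replaced by a hypothetical heuristic `propose`, the training run by a hypothetical `toy_objective`, and the single `lr` hyperparameter, the starting value, and the step schedule are all assumptions chosen only to make the loop runnable. Each trial stores a short rationale string, loosely mirroring the explainable results the abstract mentions.

```python
import math
from typing import Dict, List, Tuple

HParams = Dict[str, float]
Trial = Tuple[HParams, float, str]  # (hyperparameters, loss, rationale)


def toy_objective(hp: HParams) -> float:
    # Hypothetical stand-in for a training run; lower is better,
    # with the optimum at lr = 1e-2.
    return (math.log10(hp["lr"]) + 2.0) ** 2


def propose(history: List[Trial]) -> Tuple[HParams, str]:
    # Hypothetical stand-in for the LLM agent. A real agent would read the
    # task description and the trial log; this heuristic only probes around
    # the best trial so far, with a step that shrinks as trials accumulate.
    if not history:
        return {"lr": 0.5}, "initial guess"
    best_hp, best_loss, _ = min(history, key=lambda t: t[1])
    step = 2.0 / len(history)                      # step size in log10(lr)
    direction = -1.0 if len(history) % 2 else 1.0  # alternate sides of the best
    lr = best_hp["lr"] * 10.0 ** (direction * step)
    why = (f"best loss so far {best_loss:.4f} at lr={best_hp['lr']:.4g}; "
           f"probing lr={lr:.4g}")
    return {"lr": lr}, why


def optimize(budget: int) -> List[Trial]:
    history: List[Trial] = []
    for _ in range(budget):
        hp, why = propose(history)       # agent suggests hyperparameters
        loss = toy_objective(hp)         # run the experiment
        history.append((hp, loss, why))  # feed the result back as history
    return history


if __name__ == "__main__":
    for hp, loss, why in optimize(budget=8):
        print(f"lr={hp['lr']:.4g} loss={loss:.4f} ({why})")
```

In this sketch the history list is the only channel between trials, so swapping the heuristic for a call to a language model would leave the surrounding loop unchanged; whether AgentHPO is structured this way is not stated in the abstract.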