Gradient boosted trees (GBTs) are ubiquitous models used by researchers, machine learning (ML) practitioners, and data scientists because of their robust performance, interpretable behavior, and ease of use. One critical challenge in training GBTs is tuning their hyperparameters, which in practice is often done manually. Recently, the ML community has advocated for tuning hyperparameters through black-box optimization and has developed state-of-the-art systems to do so. However, applying such systems to GBTs has two drawbacks. First, these systems are not \textit{model-aware}; they are designed to apply to a \textit{generic} model, which leaves significant optimization performance on the table. Second, using these systems requires \textit{domain knowledge}, such as the choice of hyperparameter search space, which runs counter to the automatic experimentation that black-box optimization aims to provide. In this paper, we present SigOpt Mulch, a model-aware hyperparameter tuning system designed specifically for automated tuning of GBTs, which provides two improvements over existing systems. First, Mulch leverages powerful techniques from metalearning and multifidelity optimization to perform model-aware hyperparameter optimization. Second, it automates the process of learning performant hyperparameters by making intelligent decisions about the optimization search space, thus reducing the need for user domain knowledge. These innovations allow Mulch to identify good GBT hyperparameters far more efficiently, and in a more seamless and user-friendly way, than existing black-box hyperparameter tuning systems.
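To make the two drawbacks concrete, the following is a minimal sketch of a generic black-box tuner, assuming plain random search as the optimizer. It is not Mulch and not any particular tuning system's API. The search space names and ranges, and the `toy_validation_loss` function that stands in for training and scoring a GBT, are hypothetical and chosen only for illustration. The tuner sees nothing but the bounds the user typed in and the scalar loss that comes back, which is what "not model-aware" and "requires domain knowledge" mean in practice.

```python
import math
import random

# Hand-chosen search space: the "domain knowledge" a generic tuner asks for.
# Names and ranges are illustrative assumptions, not values from the paper.
SEARCH_SPACE = {
    "learning_rate": ("log_uniform", 1e-3, 1.0),
    "max_depth": ("int_uniform", 2, 12),
    "n_estimators": ("int_uniform", 50, 1000),
}


def sample_config(rng):
    """Draw one hyperparameter configuration from SEARCH_SPACE."""
    config = {}
    for name, (kind, low, high) in SEARCH_SPACE.items():
        if kind == "log_uniform":
            # Sample uniformly in log space, then map back.
            config[name] = math.exp(rng.uniform(math.log(low), math.log(high)))
        elif kind == "int_uniform":
            # randint is inclusive on both ends.
            config[name] = rng.randint(low, high)
        else:
            raise ValueError(f"unknown parameter kind: {kind}")
    return config


def toy_validation_loss(config):
    """Hypothetical stand-in for training a GBT and scoring it on held-out data."""
    return (
        (math.log10(config["learning_rate"]) + 1.0) ** 2
        + ((config["max_depth"] - 6) / 6.0) ** 2
        + ((config["n_estimators"] - 400) / 400.0) ** 2
    )


def random_search(objective, n_trials, seed=0):
    """Model-agnostic black-box search: it only sees configs and scalar losses."""
    rng = random.Random(seed)
    best_config, best_loss = None, math.inf
    for _ in range(n_trials):
        config = sample_config(rng)
        loss = objective(config)
        if loss < best_loss:
            best_config, best_loss = config, loss
    return best_config, best_loss


best_config, best_loss = random_search(toy_validation_loss, n_trials=50)
print(best_config, best_loss)
```

Nothing in `random_search` knows that the objective is a tree ensemble, and a poorly chosen range in `SEARCH_SPACE` silently caps how good the result can be. These are the two gaps the abstract says a model-aware system is meant to close.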