BOOST: A Data-Driven Framework for the Automated Joint Selection of Kernel and Acquisition Functions in Bayesian Optimization

The performance of Bayesian optimization (BO), a highly sample-efficient method for expensive black-box problems, depends critically on the choice of its hyperparameters, including the kernel and the acquisition function. This presents a significant practical challenge: an inappropriate combination of the two can lead to poor performance and wasted evaluations. While improvements to kernel functions and to acquisition functions have each been actively explored, the joint, autonomous selection of the best kernel-acquisition pair has been largely overlooked, leaving practitioners to rely on heuristics or costly manual tuning. In this work, we propose BOOST (Bayesian Optimization with Optimal Kernel and Acquisition Function Selection Technique), a framework that automates this selection. BOOST uses a simple offline evaluation stage to predict the performance of candidate kernel-acquisition pairs and to identify the most promising pair before committing to expensive evaluations. It is a data-driven strategy selection procedure that scores each pair by its empirical performance on the data at hand. At each iteration, the previously observed points are partitioned into a reference set and a query set. These subsets play roles analogous to training and validation sets in machine learning: the reference set is used to construct the model, while the query set stands in for unseen regions and is used to evaluate retrospectively how effectively each candidate strategy progresses toward the target value. Experiments on synthetic benchmarks and machine learning hyperparameter optimization tasks show that BOOST consistently improves over fixed-hyperparameter BO and remains competitive with state-of-the-art adaptive methods, highlighting its robustness across diverse landscapes.
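The reference/query evaluation described above can be illustrated with a minimal sketch. The abstract does not specify the splitting rule, the scoring metric, or the surrogate implementation, so everything below is an assumption made for illustration: the best observed point is treated as the target and always placed in the query set, the reference set is a random subset of the remaining points, each pair is scored by the number of query-set picks needed to reach the target, and the Gaussian process uses fixed length scales. The function names (`steps_to_target`, `select_pair`, and so on) are hypothetical and not taken from the paper.

```python
import math
import numpy as np

def rbf(a, b, ls=0.2):
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ls**2)

def matern52(a, b, ls=0.2):
    d = np.sqrt(((a[:, None, :] - b[None, :, :]) ** 2).sum(-1))
    s = math.sqrt(5.0) * d / ls
    return (1.0 + s + s**2 / 3.0) * np.exp(-s)

def gp_posterior(kernel, X, y, Xq, noise=1e-4):
    """Posterior mean and std at Xq for a GP fitted to standardized y."""
    mu0, sd0 = y.mean(), y.std() + 1e-12
    L = np.linalg.cholesky(kernel(X, X) + noise * np.eye(len(X)))
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, (y - mu0) / sd0))
    Ks = kernel(X, Xq)
    v = np.linalg.solve(L, Ks)
    var = np.clip(1.0 - (v**2).sum(0), 1e-12, None)  # prior variance is 1
    return mu0 + sd0 * (Ks.T @ alpha), sd0 * np.sqrt(var)

_erf = np.vectorize(math.erf, otypes=[float])

def ei(mu, sd, best):
    """Expected improvement for minimization."""
    z = (best - mu) / sd
    cdf = 0.5 * (1.0 + _erf(z / math.sqrt(2.0)))
    pdf = np.exp(-0.5 * z**2) / math.sqrt(2.0 * math.pi)
    return (best - mu) * cdf + sd * pdf

def lcb(mu, sd, best, beta=2.0):
    """Negated lower confidence bound, so that larger is better."""
    return -(mu - beta * sd)

def steps_to_target(kernel, acq, X, y, n_ref, rng):
    """Replay BO on observed data; count picks until the best point is chosen."""
    target = int(np.argmin(y))  # assumption: target = best observed value
    others = rng.permutation(np.delete(np.arange(len(y)), target))
    ref = list(others[:n_ref])               # plays the role of a training set
    query = list(others[n_ref:]) + [target]  # plays the role of a validation set
    steps = 0
    while True:
        mu, sd = gp_posterior(kernel, X[ref], y[ref], X[query])
        pick = query[int(np.argmax(acq(mu, sd, y[ref].min())))]
        steps += 1
        if pick == target:
            return steps
        query.remove(pick)
        ref.append(pick)

def select_pair(X, y, kernels, acqs, n_ref=5, n_splits=5, seed=0):
    """Return the pair with the fewest average steps, plus all scores."""
    scores = {}
    for kname, k in kernels.items():
        for aname, a in acqs.items():
            rng = np.random.default_rng(seed)  # identical splits for every pair
            runs = [steps_to_target(k, a, X, y, n_ref, rng) for _ in range(n_splits)]
            scores[(kname, aname)] = float(np.mean(runs))
    return min(scores, key=scores.get), scores

# Toy usage: 20 observations of a 2-D quadratic (illustrative data only).
data_rng = np.random.default_rng(1)
X = data_rng.uniform(0.0, 1.0, size=(20, 2))
y = ((X - 0.3) ** 2).sum(axis=1)
best_pair, scores = select_pair(
    X, y, {"rbf": rbf, "matern52": matern52}, {"ei": ei, "lcb": lcb}
)
```

In a full BO loop, a selection step of this kind would presumably be repeated at each iteration on the data collected so far, with the winning pair used to propose the next expensive evaluation.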
