Bayesian optimization (BO) is widely used for black-box optimization and relies on a surrogate model to approximate the black-box response function. As the number of black-box optimization tasks, both solved and yet to be solved, keeps growing, the ability to learn from multiple prior tasks and jointly pre-train a surrogate model has long been awaited as a way to further improve optimization efficiency. In this paper, we propose a simple approach to pre-training a surrogate: a Gaussian process (GP) whose kernel is defined on deep features learned by a Transformer-based encoder, trained on datasets from prior tasks with possibly heterogeneous input spaces. In addition, we provide a simple yet effective mix-up initialization strategy for the input tokens corresponding to unseen input variables, which accelerates convergence on new tasks. Experiments on both synthetic and real benchmark problems demonstrate the effectiveness of the proposed pre-training and transfer BO strategy compared with existing methods.
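As a rough illustration of the kind of surrogate the abstract describes, the following NumPy sketch builds a GP whose RBF kernel acts on learned features, and initializes the token of an unseen input variable by mixing pre-trained tokens. Everything concrete here is an assumption made for illustration: the abstract does not specify the encoder architecture, the kernel, or the exact mix-up rule, so `features` is a simple stand-in for the Transformer-based encoder (not a Transformer), `mixup_init` uses a Dirichlet-weighted convex combination, and all names and sizes are hypothetical.

```python
import numpy as np

def mixup_init(pretrained_tokens, rng, alpha=1.0):
    """Assumed mix-up rule: a random convex combination of pre-trained tokens."""
    weights = rng.dirichlet(alpha * np.ones(len(pretrained_tokens)))
    return weights @ pretrained_tokens  # shape (d_model,)

def features(X, tokens, W):
    """Stand-in for the encoder (NOT a Transformer): scale each variable's
    token by its value, mean-pool over variables, then project with tanh."""
    pooled = (X[:, :, None] * tokens[None, :, :]).mean(axis=1)  # (n, d_model)
    return np.tanh(pooled @ W)  # (n, d_feat)

def rbf(A, B, lengthscale=1.0):
    """RBF kernel evaluated on deep features."""
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * sq / lengthscale**2)

def gp_posterior(Z_train, y, Z_test, noise=1e-3):
    """Standard GP posterior mean and variance via a Cholesky factorization."""
    K = rbf(Z_train, Z_train) + noise * np.eye(len(Z_train))
    Ks = rbf(Z_test, Z_train)
    L = np.linalg.cholesky(K)
    coef = np.linalg.solve(L.T, np.linalg.solve(L, y))
    mean = Ks @ coef
    v = np.linalg.solve(L, Ks.T)
    var = 1.0 - (v**2).sum(axis=0)  # the RBF prior variance is 1
    return mean, np.maximum(var, 0.0)

rng = np.random.default_rng(0)
d_model, d_feat = 8, 4

# Hypothetical tokens for three variables seen during pre-training.
pretrained_tokens = rng.normal(size=(3, d_model))
W = rng.normal(size=(d_model, d_feat))  # stands in for pre-trained encoder weights

# New task: two known variables plus one unseen variable.
new_token = mixup_init(pretrained_tokens, rng)
tokens = np.vstack([pretrained_tokens[:2], new_token])

X_train = rng.uniform(size=(10, 3))
y_train = np.sin(3.0 * X_train.sum(axis=1))  # toy black-box response
X_test = rng.uniform(size=(5, 3))

mean, var = gp_posterior(
    features(X_train, tokens, W), y_train, features(X_test, tokens, W)
)
```

In a BO loop, `mean` and `var` would feed an acquisition function such as expected improvement. In the setting the abstract describes, the encoder weights and tokens would come from pre-training on prior tasks rather than from random draws as in this toy example.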