Bayesian optimization (BO) is a powerful tool for seeking the global optimum of black-box functions. Because evaluating a black-box function can be highly costly, it is desirable to reduce the use of expensive labeled data. For the first time, we introduce a teacher-student model that exploits semi-supervised learning to make use of large amounts of unlabeled data in the context of BO. Importantly, we show that the selection of the validation and unlabeled data is key to the performance of BO. To optimize the sampling of unlabeled data, we employ a black-box parameterized sampling distribution that is optimized as part of our bi-level optimization framework. Taking this one step further, we demonstrate that the performance of BO can be improved again by selecting unlabeled data from a dynamically fitted extreme value distribution. Our BO method operates in a learned latent space with reduced dimensionality, making it scalable to high-dimensional problems. The proposed approach significantly outperforms existing BO methods on several synthetic and real-world optimization tasks.
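The idea of drawing unlabeled points from a dynamically fitted extreme value distribution can be illustrated with a small sketch. This is not the method of the paper: it assumes a one-dimensional latent space, a maximization task, a Gumbel (type-I extreme value) family fitted by the method of moments, and a refit on the latent coordinates of the current best labeled points. The helper names `fit_gumbel`, `sample_gumbel`, and `refit_and_sample` are hypothetical.

```python
import math
import random
import statistics

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def fit_gumbel(values):
    """Method-of-moments fit of a Gumbel distribution (needs >= 2 values)."""
    mean = statistics.fmean(values)
    std = statistics.stdev(values)
    scale = std * math.sqrt(6) / math.pi   # Var = (pi^2 / 6) * scale^2
    loc = mean - EULER_GAMMA * scale       # Mean = loc + gamma * scale
    return loc, scale


def sample_gumbel(loc, scale, n, rng):
    """Inverse-CDF sampling: x = loc - scale * ln(-ln(u)), u in (0, 1)."""
    samples = []
    for _ in range(n):
        u = rng.random()
        while u == 0.0:                    # avoid log(0)
            u = rng.random()
        samples.append(loc - scale * math.log(-math.log(u)))
    return samples


def refit_and_sample(labeled, top_k, n_unlabeled, rng):
    """Assumed scheme: refit on the latent coordinates of the top_k points.

    `labeled` is a list of (latent_coordinate, objective_value) pairs.
    """
    best = sorted(labeled, key=lambda pair: pair[1], reverse=True)[:top_k]
    loc, scale = fit_gumbel([z for z, _ in best])
    return sample_gumbel(loc, scale, n_unlabeled, rng)
```

Calling `refit_and_sample` once per BO iteration, after each new labeled point arrives, would make the sampling distribution track the current best region. A real implementation would work in a multi-dimensional latent space and would likely use a more careful estimator than the method of moments.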