Bayesian optimization (BO) selects evaluation points for expensive black-box objectives using Gaussian process (GP) predictive distributions. Kernel choice and hyperparameter selection can lead to miscalibrated predictive distributions and an inappropriate exploration-exploitation trade-off. For minimization, sampling criteria such as expected improvement (EI) depend on the predictive distribution below the current best value, so lower-tail miscalibration directly affects the sampling decision. This article studies goal-oriented calibration of GP predictive distributions below a low threshold $t$ in the noiseless setting, for standard GP models with hyperparameters selected by maximum likelihood. A framework for predictive reliability below $t$ is introduced, based on two notions of spatial calibration: occurrence calibration over the design space and thresholded $\mu$-calibration on sublevel sets of the form $\{x\in\mathbb{X}, f(x)\le t\}$. Building on this framework, we propose tcGP, a post-hoc method that calibrates GP predictive distributions below $t$, and we show that the resulting EI-based global optimization algorithm remains dense in the design space. Experiments on standard benchmarks show improved lower-tail calibration and BO performance relative to standard GP models and globally calibrated GP models.
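To make the dependence of EI on the lower tail concrete, the following minimal sketch evaluates the standard closed-form EI for minimization under a Gaussian predictive distribution. It is an illustration only and not the tcGP method: the function names, the predictive mean, the current best value, and the standard deviations are all hypothetical choices made for this example.

```python
import math


def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def expected_improvement(mean, std, best):
    """Closed-form EI for minimization under a predictive N(mean, std^2).

    EI = E[max(best - Y, 0)], which only involves the part of the
    predictive distribution lying below the current best value.
    """
    if std <= 0.0:
        # Degenerate (noiseless, already observed) case: no uncertainty left.
        return max(best - mean, 0.0)
    z = (best - mean) / std
    return (best - mean) * normal_cdf(z) + std * normal_pdf(z)


# Hypothetical candidate point: predictive mean above the current best,
# so any improvement comes entirely from the lower tail.
best = 0.0
mean = 1.0

# Same mean, different predictive standard deviations (hypothetical values).
ei_by_std = {std: expected_improvement(mean, std, best) for std in (0.25, 0.5, 1.0)}

for std, ei in ei_by_std.items():
    print(f"std={std:.2f}  EI={ei:.6g}")
```

In this sketch the predictive mean is held fixed, so the differences in EI across the three cases come only from how much probability mass the predictive distribution places below the current best. An overconfident or underconfident lower tail would therefore change which candidate point the criterion prefers.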