Among the acquisition functions (AFs) used in Bayesian optimization (BO), the Gaussian process upper confidence bound (GP-UCB) and Thompson sampling (TS) are well-known options with established theoretical guarantees on the Bayesian cumulative regret (BCR). Recently, a randomized variant of GP-UCB has been shown to achieve a tighter BCR bound than GP-UCB; for brevity, we refer to this as the tighter BCR bound. Inspired by that study, this paper first shows that TS also achieves the tighter BCR bound. In practice, however, GP-UCB often requires manual hyperparameter tuning, and TS often suffers from over-exploration. We therefore analyze another AF, the probability of improvement from the maximum of a sample path (PIMS). We show that PIMS achieves the tighter BCR bound and, unlike GP-UCB, requires no hyperparameter tuning. Furthermore, we report a wide range of experiments that focus on how effectively PIMS mitigates the practical issues of GP-UCB and TS.
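To make the contrast between the three AFs concrete, the following is a minimal sketch of how each one could pick the next query point on a finite candidate grid. It is an illustration under stated assumptions, not the paper's implementation: the RBF kernel, the zero prior mean, the noise level, the grid discretization, and all function names (`rbf_kernel`, `gp_posterior`, `posterior_std`, `sample_path`, `select_gp_ucb`, `select_ts`, `select_pims`) are hypothetical choices made here. The PIMS rule is written as we read it from the description above: draw one posterior sample path, take its maximum, and maximize the probability of improvement over that maximum, which is equivalent to minimizing the standardized gap.

```python
import numpy as np


def rbf_kernel(a, b, lengthscale=0.2):
    """Squared-exponential kernel between two 1-D input arrays (assumed kernel)."""
    diff = a[:, None] - b[None, :]
    return np.exp(-0.5 * (diff / lengthscale) ** 2)


def gp_posterior(x_train, y_train, x_cand, noise_var=1e-4):
    """Zero-mean GP posterior mean and covariance on a finite candidate grid."""
    K = rbf_kernel(x_train, x_train) + noise_var * np.eye(len(x_train))
    Ks = rbf_kernel(x_train, x_cand)
    sol = np.linalg.solve(K, Ks)  # K^{-1} Ks
    mean = sol.T @ y_train
    cov = rbf_kernel(x_cand, x_cand) - Ks.T @ sol
    return mean, cov


def posterior_std(cov):
    """Pointwise posterior standard deviation, clipped against negative round-off."""
    return np.sqrt(np.clip(np.diag(cov), 0.0, None))


def sample_path(mean, cov, rng):
    """Draw one posterior sample on the grid via a clipped eigendecomposition."""
    eigvals, eigvecs = np.linalg.eigh((cov + cov.T) / 2.0)
    eigvals = np.clip(eigvals, 0.0, None)
    z = rng.standard_normal(len(mean))
    return mean + eigvecs @ (np.sqrt(eigvals) * z)


def select_gp_ucb(mean, std, beta):
    """GP-UCB: maximize mean + sqrt(beta) * std; beta must be chosen by the user."""
    return int(np.argmax(mean + np.sqrt(beta) * std))


def select_ts(path):
    """TS: query the maximizer of a single posterior sample path."""
    return int(np.argmax(path))


def select_pims(mean, std, path, eps=1e-12):
    """PIMS (as assumed here): maximize PI over the sample-path maximum g_star.

    PI(x) = 1 - Phi((g_star - mean(x)) / std(x)), so maximizing PI is the same
    as minimizing the standardized gap below. No beta-like parameter is needed.
    """
    g_star = np.max(path)
    gap = (g_star - mean) / np.maximum(std, eps)
    return int(np.argmin(gap))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x_train = np.array([0.1, 0.5, 0.9])
    y_train = np.sin(6.0 * x_train)  # toy objective, purely illustrative
    x_cand = np.linspace(0.0, 1.0, 101)

    mean, cov = gp_posterior(x_train, y_train, x_cand)
    std = posterior_std(cov)
    path = sample_path(mean, cov, rng)

    print("GP-UCB picks x =", x_cand[select_gp_ucb(mean, std, beta=4.0)])
    print("TS picks x =", x_cand[select_ts(path)])
    print("PIMS picks x =", x_cand[select_pims(mean, std, path)])
```

In this sketch, TS and PIMS share the same sample path but use it differently: TS queries wherever the path peaks, whereas PIMS uses only the peak value as a threshold and then weighs the posterior mean against the posterior uncertainty. GP-UCB is the only rule of the three that takes an explicit exploration parameter (`beta`).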