The expected improvement (EI) is one of the most popular acquisition functions for Bayesian optimization (BO) and has demonstrated good empirical performance in many applications that target the minimization of simple regret. However, under the evaluation metric of cumulative regret, EI may not be competitive, and its existing theoretical regret upper bound still has room for improvement. To adapt EI for better performance under cumulative regret, we introduce a novel quantity called the evaluation cost, which is compared against the acquisition function value, and with it develop the expected improvement-cost (EIC) algorithm. In each iteration of EIC, a new point with the largest acquisition function value is sampled only if that value exceeds its evaluation cost; if no point meets this criterion, the current best point is resampled. The evaluation cost quantifies the potential downside of sampling a point, which matters under the cumulative regret metric because the objective function value in every iteration affects the performance measure. We establish a high-probability regret upper bound for EIC based on the maximum information gain, which is tighter than the bounds of existing EI-based algorithms and comparable to the regret bounds of other popular BO algorithms such as Thompson sampling (GP-TS) and the upper confidence bound (GP-UCB). We further perform experiments to illustrate the improvement of EIC over several popular BO algorithms.
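The per-iteration selection rule described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the candidate set, the precomputed EI values, and the evaluation-cost function are all placeholders, since the abstract does not specify how the cost is defined.

```python
import numpy as np

def eic_select(candidates, ei_values, eval_costs, incumbent):
    """One hypothetical EIC iteration: sample the candidate with the largest
    acquisition (EI) value only if that value exceeds its evaluation cost;
    otherwise resample the current best (incumbent) point."""
    best_idx = int(np.argmax(ei_values))
    if ei_values[best_idx] > eval_costs[best_idx]:
        return candidates[best_idx]  # acquisition value justifies the cost
    return incumbent  # no candidate passes; resample the current best point

# Example: with EI values [0.2, 0.7, 0.3] and uniform cost 0.1, the argmax
# candidate is sampled; with all EI values below cost, the incumbent is kept.
cands = np.array([0.1, 0.5, 0.9])
next_point = eic_select(cands, np.array([0.2, 0.7, 0.3]),
                        np.array([0.1, 0.1, 0.1]), incumbent=0.0)
```

The comparison against a cost acts as a gate: it prevents exploratory samples whose expected gain is too small to offset the regret incurred by evaluating them, which is the downside the cumulative-regret metric penalizes.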