Bayesian optimisation requires fitting a Gaussian process model, which in turn requires specifying hyperparameters; most of the theoretical literature assumes that these hyperparameters are known. The commonly used maximum likelihood estimator for Gaussian process hyperparameters is consistent only if the data fills the space uniformly, which need not be the case in Bayesian optimisation. Since no guarantees exist regarding the correctness of hyperparameter estimation, and since those hyperparameters can significantly affect the Gaussian process fit, theoretical analysis of Bayesian optimisation with unknown hyperparameters is very challenging. Previously proposed algorithms with the no-regret property could handle only the special cases of unknown lengthscales and reproducing kernel Hilbert space norm, and applied only to the frequentist setting. We propose a novel algorithm, HE-GP-UCB, the first algorithm to enjoy the no-regret property in the case of unknown hyperparameters of arbitrary form, and one that supports both Bayesian and frequentist settings. Our proof idea is novel and can easily be extended to other variants of Bayesian optimisation. We show this by extending our algorithm to the adversarially robust optimisation setting under unknown hyperparameters. Finally, we empirically evaluate our algorithm on a set of toy problems and show that it can outperform the maximum likelihood estimator.