A longstanding belief holds that Bayesian Optimization (BO) with standard Gaussian processes (GPs) -- referred to as standard BO -- underperforms in high-dimensional optimization problems. While this belief seems plausible, it lacks both robust empirical evidence and theoretical justification. To address this gap, we present a systematic investigation. First, through a comprehensive evaluation across eleven widely used benchmarks, we find that while the popular Squared Exponential (SE) kernel often leads to poor performance, using Matérn kernels enables standard BO to consistently achieve top-tier results, frequently surpassing methods specifically designed for high-dimensional optimization. Second, our theoretical analysis reveals that the SE kernel's failure primarily stems from an improper initialization of the length-scale parameters, which is commonly used in practice but can cause vanishing gradients during training. We provide a probabilistic bound to characterize this issue, showing that Matérn kernels are less susceptible and can robustly handle much higher dimensions. Third, we propose a simple robust initialization strategy that dramatically improves the performance of the SE kernel, bringing it close to state-of-the-art methods, without requiring any additional priors or regularization. We prove another probabilistic bound that demonstrates how the vanishing-gradient issue can be effectively mitigated with our method. Our findings advocate for a re-evaluation of standard BO's potential in high-dimensional settings.
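To make the length-scale initialization issue concrete, the following minimal sketch (using scikit-learn kernels, with an illustrative dimension d = 300 and a sqrt(d) scaling that is an assumption, not necessarily the paper's exact strategy) compares kernel similarities at initialization: with unit length-scales the SE kernel matrix is numerically close to the identity, which is what drives the vanishing-gradient behavior, while a Matérn kernel or a dimension-scaled initialization retains usable off-diagonal structure.

```python
import numpy as np
from sklearn.gaussian_process.kernels import RBF, Matern

d = 300                        # illustrative problem dimensionality (assumption)
rng = np.random.default_rng(0)
X = rng.uniform(size=(30, d))  # random points in the unit hypercube

# With unit length-scales, squared distances between uniform points grow
# like d/6, so SE similarities exp(-r^2/2) collapse toward 0: the kernel
# matrix is numerically the identity and length-scale gradients vanish.
se_unit = RBF(length_scale=np.ones(d))

# A hypothetical dimension-scaled initialization (length-scale ~ sqrt(d)),
# which keeps typical scaled distances O(1); an illustrative choice only.
se_scaled = RBF(length_scale=np.sqrt(d) * np.ones(d))

# The Matern kernel decays only polynomially-exponentially in r (not as
# exp(-r^2)), so it saturates much later as the dimension grows.
matern = Matern(length_scale=np.ones(d), nu=2.5)

for name, kernel in [("SE, unit init", se_unit),
                     ("SE, sqrt(d) init", se_scaled),
                     ("Matern-2.5, unit init", matern)]:
    K = kernel(X)
    off_diag = K[~np.eye(len(X), dtype=bool)]
    print(f"{name:22s} mean off-diagonal similarity: {off_diag.mean():.3e}")
```

Under these assumptions, the unit-initialized SE kernel reports near-zero off-diagonal similarities, whereas the Matérn kernel and the dimension-scaled SE initialization do not, mirroring the abstract's claim at the level of a toy diagnostic.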