A long-standing belief holds that Bayesian Optimization (BO) with standard Gaussian processes (GP) -- referred to as standard BO -- underperforms in high-dimensional optimization problems. While this belief seems plausible, it lacks both robust empirical evidence and theoretical justification. To address this gap, we present a systematic investigation. First, through a comprehensive evaluation across twelve benchmarks, we find that while the popular Squared Exponential (SE) kernel often leads to poor performance, using Mat\'ern kernels enables standard BO to consistently achieve top-tier results, frequently surpassing methods specifically designed for high-dimensional optimization. Second, our theoretical analysis reveals that the failure of the SE kernel primarily stems from improper initialization of the length-scale parameters: the default values commonly used in practice can cause vanishing gradients during training. We provide a probabilistic bound characterizing this issue, showing that Mat\'ern kernels are less susceptible and can robustly handle much higher dimensions. Third, we propose a simple, robust initialization strategy that dramatically improves the performance of the SE kernel, bringing it close to state-of-the-art methods, without requiring additional priors or regularization. We prove a second probabilistic bound demonstrating that our method effectively mitigates the vanishing-gradient issue. Our findings advocate for a re-evaluation of standard BO's potential in high-dimensional settings.
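To make the claimed failure mode concrete, the following NumPy sketch (our illustration, not code from the paper) works through the underlying arithmetic: for points drawn uniformly from $[0,1]^d$, the typical squared distance grows roughly like $d/6$, so under the common unit length-scale initialization the SE kernel value and its length-scale gradient both collapse like $\exp(-cd)$, whereas the Mat\'ern-5/2 kernel decays only like $\exp(-c\sqrt{d})$. Rescaling the initial length-scale with $\sqrt{d}$ -- one plausible reading of the proposed robust initialization, assumed here for illustration -- keeps both quantities well-conditioned.

```python
import numpy as np

# SE kernel and its length-scale gradient, as functions of squared distance r2.
def se_kernel(r2, ell):
    return np.exp(-r2 / (2.0 * ell ** 2))

def se_grad_ell(r2, ell):
    # d/d(ell) of exp(-r2 / (2 ell^2)) = k(r2, ell) * r2 / ell^3
    return se_kernel(r2, ell) * r2 / ell ** 3

# Matern-5/2 kernel as a function of distance r.
def matern52_kernel(r, ell):
    s = np.sqrt(5.0) * r / ell
    return (1.0 + s + s ** 2 / 3.0) * np.exp(-s)

rng = np.random.default_rng(0)
for d in (10, 100, 1000):
    x, y = rng.uniform(size=(2, d))   # two random points in [0, 1]^d
    r2 = np.sum((x - y) ** 2)         # grows roughly like d / 6
    r = np.sqrt(r2)
    for ell in (1.0, np.sqrt(d)):     # unit default vs. dimension-scaled init
        print(f"d={d:5d} ell={ell:7.2f}  "
              f"SE k={se_kernel(r2, ell):.2e}  "
              f"SE dk/dell={se_grad_ell(r2, ell):.2e}  "
              f"Matern52 k={matern52_kernel(r, ell):.2e}")
```

Running the sketch, the SE kernel value and its gradient at $d=1000$ underflow to roughly $10^{-36}$ under unit initialization, while the Mat\'ern-5/2 kernel retains many orders of magnitude more signal, and the $\sqrt{d}$-scaled initialization keeps the SE kernel and its gradient at $O(1)$.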
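For reference, here is a minimal sketch of the kind of standard BO pipeline the abstract refers to, written with BoTorch/GPyTorch and an explicit ARD Mat\'ern-5/2 kernel fit by maximum likelihood. The toy objective, dimension, and evaluation budgets are illustrative assumptions, not the paper's experimental setup.

```python
import torch
from botorch.models import SingleTaskGP
from botorch.fit import fit_gpytorch_mll
from botorch.acquisition import LogExpectedImprovement
from botorch.optim import optimize_acqf
from gpytorch.kernels import MaternKernel, ScaleKernel
from gpytorch.mlls import ExactMarginalLogLikelihood

d = 50  # problem dimension; kept modest so the sketch runs quickly
bounds = torch.stack([torch.zeros(d, dtype=torch.double),
                      torch.ones(d, dtype=torch.double)])

def objective(X):
    # Stand-in objective (maximized); replace with the real black box.
    return -(X - 0.5).pow(2).sum(dim=-1, keepdim=True)

train_X = torch.rand(20, d, dtype=torch.double)
train_Y = objective(train_X)

for _ in range(10):  # BO iterations, kept small for illustration
    # Standard GP surrogate with an ARD Matern-5/2 kernel.
    covar = ScaleKernel(MaternKernel(nu=2.5, ard_num_dims=d))
    model = SingleTaskGP(train_X, train_Y, covar_module=covar)
    mll = ExactMarginalLogLikelihood(model.likelihood, model)
    fit_gpytorch_mll(mll)  # MLE of kernel hyperparameters

    # Maximize the acquisition function to pick the next query point.
    acqf = LogExpectedImprovement(model=model, best_f=train_Y.max())
    cand, _ = optimize_acqf(acqf, bounds=bounds, q=1,
                            num_restarts=10, raw_samples=256)
    train_X = torch.cat([train_X, cand])
    train_Y = torch.cat([train_Y, objective(cand)])

print("best value found:", train_Y.max().item())
```

Note that the only high-dimensional accommodation in this sketch is the kernel choice itself; there are no trust regions, embeddings, or structural priors, which is what the abstract means by standard BO.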