Bayesian optimization is a highly efficient approach to optimizing objective functions that are expensive to query. These objectives are typically modeled by Gaussian process (GP) surrogate models, which are easy to optimize and support exact inference. While standard GP surrogates are well established in Bayesian optimization, Bayesian neural networks (BNNs) have recently become practical function approximators, with many benefits over standard GPs, such as the ability to naturally handle non-stationarity and learn representations for high-dimensional data. In this paper, we study BNNs as alternatives to standard GP surrogates for optimization. We consider a variety of approximate inference procedures for finite-width BNNs, including high-quality Hamiltonian Monte Carlo (HMC), low-cost stochastic MCMC, and heuristics such as deep ensembles. We also consider infinite-width BNNs, linearized Laplace approximations, and partially stochastic models such as deep kernel learning. We evaluate this collection of surrogate models on diverse problems with varying dimensionality, number of objectives, non-stationarity, and discrete and continuous inputs. We find: (i) the ranking of methods is highly problem-dependent, suggesting the need for tailored inductive biases; (ii) HMC is the most successful approximate inference procedure for fully stochastic BNNs; (iii) full stochasticity may be unnecessary, as deep kernel learning is relatively competitive; (iv) deep ensembles perform relatively poorly; (v) infinite-width BNNs are particularly promising, especially in high dimensions.
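To make the role of the surrogate concrete, the following is a minimal sketch of a Bayesian optimization loop with an exact GP surrogate, the baseline that the BNN variants are compared against. It is not the paper's code. The squared-exponential kernel, the expected-improvement acquisition function, the fixed candidate grid, the toy one-dimensional objective, and all function names (`rbf_kernel`, `gp_posterior`, `expected_improvement`, `bayes_opt`, `toy_objective`) are assumptions chosen for illustration.

```python
import math
import numpy as np


def rbf_kernel(a, b, lengthscale=0.2):
    """Unit-variance squared-exponential kernel between a (n, d) and b (m, d)."""
    sq_dist = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    return np.exp(-0.5 * np.maximum(sq_dist, 0.0) / lengthscale**2)


def gp_posterior(x_train, y_train, x_test, noise=1e-4):
    """Exact GP posterior mean and standard deviation at x_test."""
    y_mean = y_train.mean()  # constant mean function (an assumption)
    k_tt = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    k_ts = rbf_kernel(x_train, x_test)
    chol = np.linalg.cholesky(k_tt)
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train - y_mean))
    mean = k_ts.T @ alpha + y_mean
    v = np.linalg.solve(chol, k_ts)
    var = 1.0 - np.sum(v**2, axis=0)  # prior variance is 1 for this kernel
    return mean, np.sqrt(np.maximum(var, 1e-12))


def expected_improvement(mean, std, best):
    """Expected improvement over the incumbent `best` (maximization)."""
    z = (mean - best) / std
    pdf = np.exp(-0.5 * z**2) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + np.vectorize(math.erf)(z / math.sqrt(2.0)))
    return (mean - best) * cdf + std * pdf


def toy_objective(x):
    """Hypothetical stand-in for an expensive black-box function on [0, 1]."""
    return -(x - 0.7) ** 2 + 0.1 * math.sin(15.0 * x)


def bayes_opt(objective, n_init=3, n_iter=10, seed=0):
    """Fit surrogate, maximize acquisition over a grid, query, repeat."""
    rng = np.random.default_rng(seed)
    candidates = np.linspace(0.0, 1.0, 201)[:, None]
    x = rng.uniform(0.0, 1.0, size=(n_init, 1))
    y = np.array([objective(xi[0]) for xi in x])
    for _ in range(n_iter):
        mean, std = gp_posterior(x, y, candidates)
        ei = expected_improvement(mean, std, y.max())
        x_next = candidates[np.argmax(ei)]
        x = np.vstack([x, x_next])
        y = np.append(y, objective(x_next[0]))
    best = int(np.argmax(y))
    return x[best, 0], y[best], x, y


if __name__ == "__main__":
    best_x, best_y, _, _ = bayes_opt(toy_objective)
    print(f"best x = {best_x:.3f}, best y = {best_y:.4f}")
```

In this sketch the loop touches the surrogate only through `gp_posterior`, which returns a predictive mean and standard deviation. Swapping in a different surrogate, for example a deep ensemble or an HMC-sampled BNN, would amount to replacing that one function with another that returns the same two quantities, which is the kind of comparison the abstract describes.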