In this paper, we use Prior-data Fitted Networks (PFNs) as a flexible surrogate for Bayesian Optimization (BO). PFNs are neural processes that are trained to approximate the posterior predictive distribution (PPD) for any prior distribution that can be efficiently sampled from. We describe how this flexibility can be exploited for surrogate modeling in BO. We use PFNs to mimic a naive Gaussian process (GP), an advanced GP, and a Bayesian Neural Network (BNN). In addition, we show how to incorporate further information into the prior, such as allowing hints about the position of optima (user priors), ignoring irrelevant dimensions, and performing non-myopic BO by learning the acquisition function. The flexibility underlying these extensions opens up vast possibilities for using PFNs for BO. We demonstrate the usefulness of PFNs for BO in a large-scale evaluation on artificial GP samples and three different hyperparameter optimization testbeds: HPO-B, Bayesmark, and PD1. We publish code alongside trained models at http://github.com/automl/PFNs4BO.
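To make the role of "a prior that can be efficiently sampled from" concrete, the following minimal sketch shows how synthetic training tasks for a PFN-style model could be generated. It is an illustration under stated assumptions, not the authors' implementation: the GP prior with an RBF kernel, the hyperparameter values, and the names `rbf_kernel`, `sample_prior_dataset`, and `make_training_task` are all hypothetical choices made for this example. A network would then be trained to predict the held-out query targets from the context points, which is how it comes to approximate the PPD.

```python
import numpy as np


def rbf_kernel(x1, x2, lengthscale=0.3, outputscale=1.0):
    """Squared-exponential kernel between two sets of points (assumed prior)."""
    sq_dists = ((x1[:, None, :] - x2[None, :, :]) ** 2).sum(-1)
    return outputscale * np.exp(-0.5 * sq_dists / lengthscale**2)


def sample_prior_dataset(rng, n_points=20, n_dims=2, noise_std=1e-2):
    """Draw one synthetic dataset (X, y) from a GP prior.

    Any other prior that can be sampled (e.g. a BNN with random weights)
    could be substituted here; the GP is only an illustrative choice.
    """
    X = rng.uniform(0.0, 1.0, size=(n_points, n_dims))
    # Jitter on the diagonal keeps the Cholesky factorization stable.
    K = rbf_kernel(X, X) + 1e-6 * np.eye(n_points)
    L = np.linalg.cholesky(K)
    f = L @ rng.standard_normal(n_points)  # latent function values
    y = f + noise_std * rng.standard_normal(n_points)  # noisy observations
    return X, y


def make_training_task(rng, n_points=20, n_dims=2):
    """Split one sampled dataset into context (observed) and query (held-out) parts."""
    X, y = sample_prior_dataset(rng, n_points, n_dims)
    # Random split position in [1, n_points - 1], so both parts are non-empty.
    n_context = int(rng.integers(1, n_points))
    context = (X[:n_context], y[:n_context])
    query = (X[n_context:], y[n_context:])
    return context, query


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    (X_ctx, y_ctx), (X_qry, y_qry) = make_training_task(rng)
    print(X_ctx.shape, y_ctx.shape, X_qry.shape, y_qry.shape)
```

In this sketch, changing the surrogate's assumptions amounts to changing only `sample_prior_dataset`; the task construction and any downstream training loop stay the same.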