In this paper, we use Prior-data Fitted Networks (PFNs) as a flexible surrogate for Bayesian Optimization (BO). PFNs are neural processes that are trained to approximate the posterior predictive distribution (PPD) through in-context learning on any prior distribution that can be efficiently sampled from. We describe how this flexibility can be exploited for surrogate modeling in BO. We use PFNs to mimic a naive Gaussian process (GP), an advanced GP, and a Bayesian Neural Network (BNN). In addition, we show how to incorporate further information into the prior, such as allowing hints about the position of optima (user priors), ignoring irrelevant dimensions, and performing non-myopic BO by learning the acquisition function. The flexibility underlying these extensions opens up vast possibilities for using PFNs for BO. We demonstrate the usefulness of PFNs for BO in a large-scale evaluation on artificial GP samples and three different hyperparameter optimization testbeds: HPO-B, Bayesmark, and PD1. We publish code alongside trained models at https://github.com/automl/PFNs4BO.
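As a rough illustration of what "a prior distribution that can be efficiently sampled from" can mean in practice, the following minimal sketch draws one synthetic dataset from a zero-mean GP prior and splits it into a context set and held-out query points, the kind of task a PFN could be trained on. The function names (`sample_gp_prior_task`, `split_context_query`), the RBF kernel, and all hyperparameter values are assumptions made for this example; they are not taken from the paper or from the PFNs4BO code base.

```python
import numpy as np


def sample_gp_prior_task(n_points, n_dims, lengthscale=0.5, noise_std=0.01, rng=None):
    """Draw one synthetic dataset (X, y) from a zero-mean GP prior.

    Hypothetical helper: the RBF kernel and the default hyperparameters
    are illustrative assumptions, not the priors used in the paper.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Inputs sampled uniformly from the unit hypercube.
    X = rng.uniform(0.0, 1.0, size=(n_points, n_dims))
    # RBF kernel matrix from pairwise squared Euclidean distances.
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    K = np.exp(-0.5 * sq_dists / lengthscale**2)
    # Small jitter on the diagonal keeps the Cholesky factorization stable.
    L = np.linalg.cholesky(K + 1e-6 * np.eye(n_points))
    # Latent function values f ~ N(0, K), then noisy observations y.
    f = L @ rng.standard_normal(n_points)
    y = f + noise_std * rng.standard_normal(n_points)
    return X, y


def split_context_query(X, y, n_context):
    """Split a sampled dataset into context (observed) and query (held-out) points."""
    return (X[:n_context], y[:n_context]), (X[n_context:], y[n_context:])


rng = np.random.default_rng(0)
X, y = sample_gp_prior_task(n_points=20, n_dims=2, rng=rng)
(ctx_X, ctx_y), (qry_X, qry_y) = split_context_query(X, y, n_context=15)
```

Under this sketch, a network trained to predict `qry_y` from `(ctx_X, ctx_y, qry_X)` across many such sampled tasks would be learning to approximate the PPD of the chosen prior. Swapping the sampler for a different prior (for example, one based on a BNN) would change the surrogate without changing the training loop.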