In this paper, we use Prior-data Fitted Networks (PFNs) as a flexible surrogate for Bayesian Optimization (BO). PFNs are neural processes that are trained to approximate the posterior predictive distribution (PPD) through in-context learning on any prior distribution that can be efficiently sampled from. We describe how this flexibility can be exploited for surrogate modeling in BO. We use PFNs to mimic a naive Gaussian process (GP), an advanced GP, and a Bayesian Neural Network (BNN). In addition, we show how to incorporate further information into the prior, such as allowing hints about the position of optima (user priors), ignoring irrelevant dimensions, and performing non-myopic BO by learning the acquisition function. The flexibility underlying these extensions opens up vast possibilities for using PFNs for BO. We demonstrate the usefulness of PFNs for BO in a large-scale evaluation on artificial GP samples and three different hyperparameter optimization testbeds: HPO-B, Bayesmark, and PD1. We publish code alongside trained models at github.com/automl/PFNs4BO.
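To make the phrase "any prior distribution that can be efficiently sampled from" concrete, the following is a minimal sketch of how one synthetic training dataset could be drawn from a simple GP prior and split into an in-context (observed) part and a held-out query part. It is an illustration under stated assumptions, not the code released at github.com/automl/PFNs4BO: the RBF kernel, its lengthscale, the noise level, and all function names (`rbf_kernel`, `sample_prior_dataset`, `split_context_query`) are hypothetical choices made for this example.

```python
import numpy as np


def rbf_kernel(x1, x2, lengthscale=0.3, outputscale=1.0):
    """Squared-exponential kernel between point sets of shape (n, d) and (m, d).

    The kernel family and its hyperparameters are assumptions for this sketch.
    """
    sq_dists = ((x1[:, None, :] - x2[None, :, :]) ** 2).sum(axis=-1)
    return outputscale * np.exp(-0.5 * sq_dists / lengthscale**2)


def sample_prior_dataset(rng, n_points=20, n_dims=2, noise_std=0.01):
    """Draw one synthetic dataset (X, y) from a GP prior on the unit cube."""
    X = rng.uniform(0.0, 1.0, size=(n_points, n_dims))
    # Small jitter on the diagonal keeps the Cholesky factorization stable.
    K = rbf_kernel(X, X) + 1e-6 * np.eye(n_points)
    chol = np.linalg.cholesky(K)
    f = chol @ rng.normal(size=n_points)          # latent function values
    y = f + noise_std * rng.normal(size=n_points)  # noisy observations
    return X, y


def split_context_query(X, y, n_context):
    """Split a dataset into observed (context) points and held-out query points."""
    return (X[:n_context], y[:n_context]), (X[n_context:], y[n_context:])


rng = np.random.default_rng(0)
X, y = sample_prior_dataset(rng)
(ctx_X, ctx_y), (qry_X, qry_y) = split_context_query(X, y, n_context=15)
```

In this reading, a PFN would be trained over many such draws to predict the query targets given the context points, so that a single forward pass approximates the PPD. Replacing `sample_prior_dataset` with a sampler for a different prior (for example, a BNN) changes the model being mimicked without changing the training loop.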