In this paper, we use Prior-data Fitted Networks (PFNs) as a flexible surrogate for Bayesian Optimization (BO). PFNs are neural processes that are trained to approximate the posterior predictive distribution (PPD) through in-context learning on any prior distribution that can be efficiently sampled from. We describe how this flexibility can be exploited for surrogate modeling in BO. We use PFNs to mimic a naive Gaussian process (GP), an advanced GP, and a Bayesian Neural Network (BNN). In addition, we show how to incorporate further information into the prior, such as allowing hints about the position of optima (user priors), ignoring irrelevant dimensions, and performing non-myopic BO by learning the acquisition function. The flexibility underlying these extensions opens up vast possibilities for using PFNs for BO. We demonstrate the usefulness of PFNs for BO in a large-scale evaluation on artificial GP samples and three different hyperparameter optimization testbeds: HPO-B, Bayesmark, and PD1. We publish code alongside trained models at https://github.com/automl/PFNs4BO.
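To make "any prior distribution that can be efficiently sampled from" concrete, the following is a minimal sketch of drawing one synthetic dataset from a prior and splitting it into the observed context and the held-out query points whose targets a PFN would be trained to predict. It is an illustration, not the code in the authors' repository. The function names (`sample_prior_function`, `sample_prior_dataset`, `split_context_query`), the hyperparameter values, and the use of random Fourier features to approximate a GP prior with an RBF kernel are all assumptions, chosen to keep the example dependency-free.

```python
import math
import random


def sample_prior_function(rng, n_dims, n_features=200, lengthscale=0.3):
    """Draw one random function from an approximate GP prior (RBF kernel).

    Uses random Fourier features: frequencies ~ N(0, 1/lengthscale^2),
    phases ~ U(0, 2*pi), weights ~ N(0, 1). This is an illustrative choice,
    not the prior used in the paper.
    """
    omegas = [
        [rng.gauss(0.0, 1.0 / lengthscale) for _ in range(n_dims)]
        for _ in range(n_features)
    ]
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_features)]
    weights = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
    scale = math.sqrt(2.0 / n_features)

    def f(x):
        # Weighted sum of random cosine features evaluated at x.
        return scale * sum(
            w * math.cos(sum(o * xi for o, xi in zip(omega, x)) + b)
            for w, omega, b in zip(weights, omegas, phases)
        )

    return f


def sample_prior_dataset(rng, n_points=20, n_dims=2, noise_std=0.01):
    """Sample inputs uniformly in [0, 1]^d and noisy targets from one prior draw."""
    f = sample_prior_function(rng, n_dims)
    X = [[rng.uniform(0.0, 1.0) for _ in range(n_dims)] for _ in range(n_points)]
    y = [f(x) + rng.gauss(0.0, noise_std) for x in X]
    return X, y


def split_context_query(X, y, n_context):
    """Split a dataset into observed context points and held-out query points."""
    return (X[:n_context], y[:n_context]), (X[n_context:], y[n_context:])


rng = random.Random(0)
X, y = sample_prior_dataset(rng)
(ctx_X, ctx_y), (qry_X, qry_y) = split_context_query(X, y, n_context=15)
```

Repeating this sampling step many times would yield the kind of training stream the abstract describes: the network sees the context pairs and is asked to predict the query targets. Swapping `sample_prior_function` for a different sampler (for example, a randomly initialized neural network) changes the prior without changing the rest of the sketch.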