Neural Processes (NPs) are a popular class of approaches for meta-learning. Similar to Gaussian Processes (GPs), NPs define distributions over functions and can estimate uncertainty in their predictions. However, unlike GPs, NPs and their variants suffer from underfitting and often have intractable likelihoods, which limit their applications in sequential decision making. We propose Transformer Neural Processes (TNPs), a new member of the NP family that casts uncertainty-aware meta-learning as a sequence modeling problem. We learn TNPs via an autoregressive likelihood-based objective and instantiate the model with a novel transformer-based architecture. The model architecture respects the inductive biases inherent to the problem structure, such as invariance to the ordering of the observed data points and equivariance to the ordering of the unobserved points. We further investigate knobs within the TNP framework that trade off the expressivity of the decoding distribution against extra computation. Empirically, we show that TNPs achieve state-of-the-art performance on various benchmark problems, outperforming all previous NP variants on meta-regression, image completion, contextual multi-armed bandits, and Bayesian optimization.
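As a rough illustration of what an autoregressive likelihood-based objective can look like, the sketch below factorizes the predictive distribution over the unobserved points one point at a time. The notation is assumed for illustration and does not come from the abstract: $m$ observed (context) pairs $(x_{1:m}, y_{1:m})$, unobserved (target) points indexed $m+1, \dots, N$, and model parameters $\theta$. Conditioning each target only on inputs up to index $i$ is also an assumption of this sketch.

```latex
% Assumed notation: (x_{1:m}, y_{1:m}) are observed points,
% indices m+1..N are unobserved points, \theta are model parameters.
\[
  p_\theta\left(y_{m+1:N} \mid x_{1:N},\, y_{1:m}\right)
  = \prod_{i=m+1}^{N} p_\theta\left(y_i \mid x_{1:i},\, y_{1:i-1}\right)
\]
% Training then maximizes the expected log-likelihood over sampled tasks:
\[
  \mathcal{L}(\theta)
  = \mathbb{E}\left[\, \sum_{i=m+1}^{N} \log p_\theta\left(y_i \mid x_{1:i},\, y_{1:i-1}\right) \right]
\]
```

Because every factor is an explicit conditional density, the joint likelihood under this kind of factorization is tractable by construction, which is the property the abstract contrasts with earlier NP variants.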