Neural Processes (NPs) are appealing due to their ability to perform fast adaptation based on a context set. This set is encoded by a latent variable, which is often assumed to follow a simple distribution. However, in real-world settings, the context set may be drawn from richer distributions that have, for example, multiple modes or heavy tails. In this work, we provide a framework that allows the latent variable of an NP to be given a rich prior defined by a graphical model. These distributional assumptions directly translate into an appropriate aggregation strategy for the context set. Moreover, we describe a message-passing procedure that still allows for end-to-end optimization with stochastic gradients. We demonstrate the generality of our framework by using mixture and Student-t assumptions, which yield improvements in function modelling and test-time robustness.
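To make concrete the claim that a distributional assumption determines the aggregation rule, the sketch below works through the simplest case: a diagonal Gaussian prior on the latent variable, with each context point contributing an independent Gaussian observation of it. Under these assumptions, the standard conjugate update gives a precision-weighted average of the per-point encodings. This is an illustration only, not the framework's implementation: the function name `gaussian_aggregate` and the inputs `r`, `var_r`, `mu0`, and `var0` are hypothetical, and the mixture and Student-t variants mentioned above are not reproduced here.

```python
import numpy as np


def gaussian_aggregate(r, var_r, mu0, var0):
    """Conjugate Gaussian update for a latent variable z (diagonal covariances).

    Assumed model (hypothetical, for illustration only):
        z    ~ N(mu0, var0)          prior on the latent variable
        r[n] ~ N(z, var_r[n])        one noisy "observation" of z per context point

    r, var_r : arrays of shape (num_context, latent_dim)
    mu0, var0: arrays of shape (latent_dim,)
    Returns the posterior mean and variance of z, each of shape (latent_dim,).
    """
    r = np.asarray(r, dtype=float)
    var_r = np.asarray(var_r, dtype=float)

    # Precisions add: prior precision plus one term per context point.
    precision = 1.0 / var0 + np.sum(1.0 / var_r, axis=0)
    var_post = 1.0 / precision

    # The posterior mean is a precision-weighted average of prior mean and encodings.
    mu_post = var_post * (mu0 / var0 + np.sum(r / var_r, axis=0))
    return mu_post, var_post


if __name__ == "__main__":
    # Two context points, one latent dimension, standard-normal prior.
    mu, var = gaussian_aggregate(
        r=[[1.0], [3.0]],
        var_r=[[1.0], [1.0]],
        mu0=np.zeros(1),
        var0=np.ones(1),
    )
    print(mu, var)
```

Because the update depends on the context points only through sums, it is invariant to their order, and context points with large `var_r` are down-weighted automatically. Replacing the Gaussian assumption with a different one would change this rule, which is the sense in which the prior and the aggregation strategy are linked.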