Neural Processes (NPs) are deep probabilistic models that represent stochastic processes by conditioning their prior distributions on a set of context points. Despite their clear advantages in uncertainty estimation for complex distributions, NPs enforce parameterization coupling between the conditional prior model and the posterior model, thereby risking the introduction of a misspecified prior distribution. We hereby revisit the NP objectives and propose R\'enyi Neural Processes (RNP) to ameliorate the impact of prior misspecification by optimizing an alternative posterior that achieves a better marginal likelihood. More specifically, by replacing the standard KL divergence with the R\'enyi divergence between the model posterior and the true posterior, we scale the density ratio $\frac{p}{q}$ by the power of $(1-\alpha)$ in the divergence gradients with respect to the posterior. This hyperparameter $\alpha$ allows us to dampen the effect of the misspecified prior on the posterior update, which has been shown to effectively avoid oversmoothed predictions and improve the expressiveness of the posterior model. Our extensive experiments show consistent log-likelihood improvements over state-of-the-art NP family models that adopt either variational inference or maximum likelihood estimation objectives. We validate the effectiveness of our approach across multiple benchmarks, including regression and image inpainting tasks, and show significant performance improvements of RNPs on real-world regression problems where the underlying prior model is misspecified.
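For reference, a minimal sketch of the divergence swap described above, using the standard definition of the R\'enyi divergence of order $\alpha$ between the model posterior $q$ and the true posterior $p$ (the notation here is illustrative and need not match the paper's exact formulation):
\[
D_\alpha(q \,\|\, p) = \frac{1}{\alpha - 1} \log \int q(z)^{\alpha} \, p(z)^{1-\alpha} \, dz ,
\]
which recovers the standard KL divergence $\mathrm{KL}(q \,\|\, p)$ in the limit $\alpha \to 1$. The $(1-\alpha)$ power on the density ratio that appears in its gradients with respect to $q$ is what lets the hyperparameter $\alpha$ temper the influence of a misspecified prior on the posterior update.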