Predictive posterior densities (PPDs) are of interest in approximate Bayesian inference. Typically, these are estimated by simple Monte Carlo (MC) averages using samples from the approximate posterior. We observe that the signal-to-noise ratio (SNR) of such estimators can be extremely low. An analysis for exact inference reveals that the SNR decays exponentially with increasing (a) mismatch between training and test data, (b) dimensionality of the latent space, or (c) size of the test data relative to the training data. Further analysis extends these results to approximate inference. To remedy the low SNR problem, we propose replacing simple MC sampling with importance sampling, using a proposal distribution optimized at test time on a variational proxy for the SNR, and demonstrate that this yields greatly improved estimates.
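As a rough illustration of the low-SNR effect, the following sketch compares the two estimators on a toy conjugate Gaussian model, chosen here only because its PPD has a closed form. The model, all function names, and the proposal are assumptions made for this example. In particular, the proposal is a widened version of the posterior given both training and test data, used as a stand-in for a proposal optimized at test time; it is not the optimization procedure the abstract refers to.

```python
import numpy as np

def log_normal_pdf(x, mean, std):
    # Elementwise log density of N(mean, std^2).
    return -0.5 * np.log(2 * np.pi) - np.log(std) - 0.5 * ((x - mean) / std) ** 2

def posterior_params(data):
    # Assumed toy model: z ~ N(0, 1), x_i | z ~ N(z, 1).
    # The exact posterior is N(sum(x) / (n + 1), 1 / (n + 1)).
    var = 1.0 / (data.size + 1.0)
    return var * data.sum(), np.sqrt(var)

def log_lik(test, z):
    # log p(D* | z) for each latent sample z; returns shape (K,).
    return log_normal_pdf(test[None, :], z[:, None], 1.0).sum(axis=1)

def exact_log_ppd(train, test):
    # Bayes' rule: p(D* | D) = p(D* | z) p(z | D) / p(z | D, D*) for any z.
    mu, sd = posterior_params(train)
    mu_all, sd_all = posterior_params(np.concatenate([train, test]))
    z = np.zeros(1)
    return (log_lik(test, z) + log_normal_pdf(z, mu, sd)
            - log_normal_pdf(z, mu_all, sd_all))[0]

def simple_mc_log_weights(train, test, K, rng):
    # Simple MC: z_k ~ p(z | D), weight w_k = p(D* | z_k).
    mu, sd = posterior_params(train)
    z = rng.normal(mu, sd, size=K)
    return log_lik(test, z)

def is_log_weights(train, test, K, rng, inflate=1.5):
    # Importance sampling: z_k ~ q, weight w_k = p(D* | z_k) p(z_k | D) / q(z_k).
    # q is a hand-picked stand-in: the posterior given D and D*, widened by `inflate`.
    mu, sd = posterior_params(train)
    mu_q, sd_q = posterior_params(np.concatenate([train, test]))
    sd_q = inflate * sd_q
    z = rng.normal(mu_q, sd_q, size=K)
    return log_lik(test, z) + log_normal_pdf(z, mu, sd) - log_normal_pdf(z, mu_q, sd_q)

def log_mean_exp(log_w):
    # Numerically stable log of the average weight (the PPD estimate).
    m = log_w.max()
    return m + np.log(np.mean(np.exp(log_w - m)))

def empirical_snr(log_w):
    # Sample mean / sample std of the weights (scale-invariant).
    # With heavy-tailed weights this is optimistic: it cannot fall much below 1/sqrt(K).
    w = np.exp(log_w - log_w.max())
    return w.mean() / w.std()

rng = np.random.default_rng(0)
train = rng.normal(0.0, 1.0, size=50)
test = rng.normal(3.0, 1.0, size=50)  # shifted test set: train/test mismatch
K = 10_000

exact = exact_log_ppd(train, test)
lw_mc = simple_mc_log_weights(train, test, K, rng)
lw_is = is_log_weights(train, test, K, rng)
est_mc, est_is = log_mean_exp(lw_mc), log_mean_exp(lw_is)
snr_mc, snr_is = empirical_snr(lw_mc), empirical_snr(lw_is)

print(f"exact log PPD: {exact:.2f}")
print(f"simple MC    : {est_mc:.2f} (empirical SNR {snr_mc:.3g})")
print(f"importance   : {est_is:.2f} (empirical SNR {snr_is:.3g})")
```

In this setup the simple MC average is dominated by a handful of posterior samples that happen to land near the test data, so its log estimate falls far below the exact value, while the importance-sampling estimate stays close to it. Shrinking the shift in `test` or reducing its size relative to `train` narrows the gap, in line with factors (a) and (c) above.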