In variational inference, the benefits of Bayesian models rely on accurately capturing the true posterior distribution. We propose using neural samplers that specify implicit distributions, which are well-suited for approximating complex multimodal and correlated posteriors in high-dimensional spaces. Our approach advances inference with implicit distributions by introducing novel bounds that arise from locally linearising the neural sampler. This is distinct from existing methods, which rely on additional discriminator networks and unstable adversarial objectives. Furthermore, we present a new sampler architecture that, for the first time, enables implicit distributions over millions of latent variables, addressing computational concerns by using differentiable numerical approximations. Our empirical analysis indicates that our method can recover correlations across layers in large Bayesian neural networks, a property that is crucial for a network's performance but notoriously challenging to achieve. To the best of our knowledge, no other method has been shown to accomplish this task for such large models. Through experiments on downstream tasks, we demonstrate that our expressive posteriors outperform state-of-the-art uncertainty quantification methods, validating the effectiveness of our training algorithm and the quality of the learned implicit approximation.
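To make the idea of locally linearising a neural sampler concrete, the following is a minimal sketch and not the bound or architecture used in this work. It assumes a toy two-layer sampler `g` with equal input and output dimension, Gaussian base noise centred at `z0` with standard deviation `sigma`, and an invertible Jacobian; all function names are hypothetical. Under the first-order expansion g(z) ≈ g(z0) + J (z − z0), the pushed-forward distribution is Gaussian with covariance σ² J Jᵀ, so its entropy is available in closed form without a discriminator network.

```python
import numpy as np


def sampler(z, W1, b1, W2, b2):
    """Toy neural sampler g (hypothetical): maps base noise z to a sample."""
    return W2 @ np.tanh(W1 @ z + b1) + b2


def sampler_jacobian(z, W1, b1, W2):
    """Exact Jacobian dg/dz of the toy sampler, shape (d_out, d_in)."""
    h = np.tanh(W1 @ z + b1)
    # d tanh(a)/da = 1 - tanh(a)^2, applied row-wise to W1
    return W2 @ ((1.0 - h ** 2)[:, None] * W1)


def linearised_entropy(z0, sigma, W1, b1, W2):
    """Entropy of the Gaussian obtained by pushing N(z0, sigma^2 I) through
    the first-order Taylor expansion of g around z0.

    Assumption: d_in == d_out and the Jacobian at z0 is invertible, so that
    0.5 * log det(sigma^2 J J^T) = d * log(sigma) + log|det J|.
    """
    J = sampler_jacobian(z0, W1, b1, W2)
    d = J.shape[0]
    _, logabsdet = np.linalg.slogdet(J)
    return 0.5 * d * np.log(2.0 * np.pi * np.e * sigma ** 2) + logabsdet
```

The closed-form entropy is only as good as the linear approximation, so a small `sigma` (noise concentrated near `z0`) is the regime in which this sketch is meaningful. For samplers whose output dimension exceeds the input dimension, the square-Jacobian shortcut above no longer applies.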