In variational inference, the benefits of Bayesian models rely on accurately capturing the true posterior distribution. We propose using neural samplers that specify implicit distributions, which are well-suited for approximating complex multimodal and correlated posteriors in high-dimensional spaces. Our approach introduces novel bounds for approximate inference using implicit distributions by locally linearising the neural sampler. This is distinct from existing methods that rely on additional discriminator networks and unstable adversarial objectives. Furthermore, we present a new sampler architecture that, for the first time, enables implicit distributions over tens of millions of latent variables, addressing computational concerns by using differentiable numerical approximations. We empirically show that our method is capable of recovering correlations across layers in large Bayesian neural networks, a property that is crucial for a network's performance but notoriously challenging to achieve. To the best of our knowledge, no other method has been shown to accomplish this task for such large models. Through experiments in downstream tasks, we demonstrate that our expressive posteriors outperform state-of-the-art uncertainty quantification methods, validating the effectiveness of our training algorithm and the quality of the learned implicit approximation.
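The abstract does not spell out the bounds themselves, so the following is only a minimal sketch of what "locally linearising the neural sampler" can mean, not the authors' algorithm. It assumes a hypothetical toy sampler `sampler` mapping R^2 to R^2, replaces it near a base point `z0` by its first-order expansion g(z0 + eps) ≈ g(z0) + J(z0) eps, and uses the fact that a Gaussian N(z0, s^2 I) pushed through that linear map is again Gaussian, with covariance s^2 J J^T and a closed-form entropy. All function names and weights are illustrative assumptions, and the finite-difference Jacobian stands in for the automatic differentiation a real implementation would likely use.

```python
import math


def sampler(z):
    """Hypothetical toy sampler g: R^2 -> R^2 (one tanh layer plus a skip connection)."""
    z1, z2 = z
    h1 = math.tanh(1.5 * z1 - 0.5 * z2)
    h2 = math.tanh(0.3 * z1 + 1.2 * z2)
    return (z1 + 0.8 * h1, z2 + 0.8 * h2)


def jacobian(f, z, h=1e-5):
    """Central finite-difference Jacobian, J[i][j] = d f_i / d z_j."""
    n = len(z)
    cols = []
    for j in range(n):
        zp, zm = list(z), list(z)
        zp[j] += h
        zm[j] -= h
        fp, fm = f(zp), f(zm)
        cols.append([(a - b) / (2.0 * h) for a, b in zip(fp, fm)])
    # cols[j][i] holds d f_i / d z_j; transpose into row-major J[i][j].
    return [[cols[j][i] for j in range(n)] for i in range(len(cols[0]))]


def linearised(f, z0, eps):
    """First-order expansion of f around z0, evaluated at z0 + eps."""
    g0 = f(z0)
    J = jacobian(f, z0)
    return tuple(
        g0[i] + sum(J[i][j] * eps[j] for j in range(len(eps)))
        for i in range(len(g0))
    )


def local_gaussian_entropy(f, z0, scale):
    """Entropy of N(f(z0), scale^2 J J^T) for a 2x2 Jacobian J.

    Uses H = 0.5 * d * log(2 * pi * e * scale^2) + log|det J| with d = 2.
    """
    J = jacobian(f, z0)
    det = J[0][0] * J[1][1] - J[0][1] * J[1][0]
    return 0.5 * 2 * math.log(2.0 * math.pi * math.e * scale ** 2) + math.log(abs(det))


z0 = (0.2, -0.1)
eps = (1e-3, -2e-3)
exact = sampler((z0[0] + eps[0], z0[1] + eps[1]))
approx = linearised(sampler, z0, eps)
lin_error = max(abs(a - b) for a, b in zip(exact, approx))
entropy_small = local_gaussian_entropy(sampler, z0, 0.1)
entropy_large = local_gaussian_entropy(sampler, z0, 1.0)
```

Here `lin_error` measures how far the linearised sampler is from the true one for a small perturbation, and the two entropy values show that the local Gaussian surrogate has a tractable entropy even though the sampler's own density is implicit. A tractable quantity of this kind is what makes a discriminator-free objective plausible; the exact form used in the paper may differ.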