Weight space symmetries in neural network architectures, such as permutation symmetries in MLPs, give rise to Bayesian neural network (BNN) posteriors with many equivalent modes. This multimodality poses a challenge for variational inference (VI) techniques, which typically rely on approximating the posterior with a unimodal distribution. In this work, we investigate the impact of weight space permutation symmetries on VI. We demonstrate, both theoretically and empirically, that these symmetries lead to biases in the approximate posterior that degrade predictive performance and posterior fit if not explicitly accounted for. To address this, we leverage the symmetric structure of the posterior and devise a symmetrization mechanism for constructing permutation-invariant variational posteriors. We show that the symmetrized distribution provides a strictly better fit to the true posterior, and that it can be trained using the original ELBO objective with a modified KL regularization term. We demonstrate experimentally that our approach mitigates the aforementioned biases and results in improved predictions and a higher ELBO.
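For concreteness, the construction can be sketched as follows; the notation here, including the group $G$ and the base distribution $q$, is an assumed form of the mechanism rather than the paper's exact definitions. Let $G$ denote the finite group of weight-space permutations under which the network function is invariant, and let $q$ be a standard (e.g., unimodal Gaussian) variational posterior over the weights $\theta$. The symmetrized posterior averages $q$ over the group:
\[
q_{\mathrm{sym}}(\theta) = \frac{1}{|G|} \sum_{g \in G} q(g \cdot \theta).
\]
Because the likelihood is $G$-invariant, the expected log-likelihood term of the ELBO is unchanged by this averaging, and only the KL regularizer must be evaluated at $q_{\mathrm{sym}}$:
\[
\mathcal{L}(q_{\mathrm{sym}}) = \mathbb{E}_{q}\big[\log p(\mathcal{D} \mid \theta)\big] - \mathrm{KL}\big(q_{\mathrm{sym}} \,\|\, p(\theta)\big),
\]
which is consistent with training under the original ELBO objective with a modified KL regularization term, as described above.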