In this work, we advocate for the importance of singular learning theory (SLT) as it pertains to the theory and practice of variational inference in Bayesian neural networks (BNNs). To begin, we use SLT to lay to rest some of the confusion surrounding discrepancies between the variational objective and downstream predictive performance, as measured by, e.g., the test log predictive density. Next, we use the SLT-corrected asymptotic form for singular posterior distributions to inform the design of the variational family itself. Specifically, we build upon the idealized variational family introduced by \citet{bhattacharya_evidence_2020}, which is theoretically appealing but practically intractable. Our proposal takes shape as a normalizing flow whose base distribution is a carefully initialized generalized gamma. We conduct experiments comparing this to the canonical Gaussian base distribution and show improvements in terms of variational free energy and variational generalization error.
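To make the idea of a generalized gamma base distribution concrete, the following is a minimal sketch and not the implementation used in this work. It assumes the standard Stacy parameterization, with density proportional to x^(d-1) exp(-(x/a)^p) on x > 0; the names `a`, `d`, `p` and all function names are hypothetical. A toy elementwise affine map stands in for the normalizing flow, only to illustrate the change-of-variables bookkeeping.

```python
import math
import numpy as np


def sample_generalized_gamma(rng, a, d, p, size):
    """Draw samples with density proportional to x**(d-1) * exp(-(x/a)**p).

    Uses the fact that if G ~ Gamma(shape=d/p, scale=1), then a * G**(1/p)
    follows the generalized gamma distribution (Stacy parameterization).
    """
    g = rng.gamma(shape=d / p, scale=1.0, size=size)
    return a * g ** (1.0 / p)


def log_prob_generalized_gamma(x, a, d, p):
    """Log-density of the generalized gamma distribution at x > 0."""
    x = np.asarray(x, dtype=float)
    log_norm = math.log(p) - d * math.log(a) - math.lgamma(d / p)
    return log_norm + (d - 1.0) * np.log(x) - (x / a) ** p


def affine_flow(x, scale, shift):
    """Toy invertible map y = scale * x + shift (a stand-in for a real flow).

    Returns y and the elementwise log|dy/dx| needed for change of variables.
    """
    y = scale * x + shift
    log_det = np.full_like(x, math.log(abs(scale)))
    return y, log_det


def flow_log_prob(x, a, d, p, scale, shift):
    """Log-density of y = affine_flow(x), evaluated through the base sample x."""
    _, log_det = affine_flow(x, scale, shift)
    return log_prob_generalized_gamma(x, a, d, p) - log_det


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = sample_generalized_gamma(rng, a=1.0, d=2.0, p=2.0, size=5)
    y, _ = affine_flow(x, scale=0.5, shift=0.1)
    print(y)
    print(flow_log_prob(x, a=1.0, d=2.0, p=2.0, scale=0.5, shift=0.1))
```

Swapping `sample_generalized_gamma` and `log_prob_generalized_gamma` for Gaussian counterparts would give the canonical Gaussian-base setup that serves as the comparison; how the base parameters are initialized is not specified here and is left out of the sketch.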