Bayesian neural networks (BNNs) have recently gained popularity due to their ability to quantify model uncertainty. However, specifying a prior for BNNs that captures relevant domain knowledge is often extremely challenging. In this work, we propose a framework for integrating general forms of domain knowledge (i.e., any knowledge that can be represented by a loss function) into a BNN prior through variational inference, while enabling computationally efficient posterior inference and sampling. Specifically, our approach yields a prior over neural network weights that assigns high probability mass to models that better align with our domain knowledge, leading to posterior samples that also exhibit this behavior. We show that BNNs using our proposed domain knowledge priors outperform those with standard priors (e.g., isotropic Gaussian, Gaussian process): they successfully incorporate diverse types of prior information, such as fairness, physics rules, and healthcare knowledge, and achieve better predictive performance. We also present techniques for transferring the learned priors across different model architectures, demonstrating their broad utility across various settings.
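To make the idea of a loss-informed prior concrete, the following is a minimal sketch under assumptions that the abstract does not state. It assumes a diagonal Gaussian prior q(w) = N(mu, diag(sigma^2)) fitted by minimizing lam * E_q[loss(w)] + KL(q || N(0, I)) with reparameterized gradients. The toy loss (penalizing a negative first weight, standing in for a monotonicity constraint), the weighting `lam`, and the names `domain_loss_grad` and `fit_prior` are all hypothetical illustrations, not the authors' formulation.

```python
import numpy as np


def domain_loss_grad(w):
    """Hypothetical domain-knowledge loss: penalize a negative first weight.

    w has shape (n_samples, dim). Returns the per-sample loss and its
    gradient with respect to w.
    """
    violation = np.maximum(-w[:, 0], 0.0)
    loss = violation ** 2
    grad = np.zeros_like(w)
    grad[:, 0] = -2.0 * violation
    return loss, grad


def fit_prior(dim=2, lam=5.0, steps=500, n_samples=256, lr=0.05, seed=0):
    """Fit a diagonal Gaussian prior by stochastic gradient descent.

    Objective (an assumption of this sketch):
        lam * E_q[loss(w)] + KL(q || N(0, I)),  q = N(mu, diag(sigma^2)).
    """
    rng = np.random.default_rng(seed)
    mu = np.zeros(dim)
    log_sigma = np.zeros(dim)
    for _ in range(steps):
        sigma = np.exp(log_sigma)
        eps = rng.standard_normal((n_samples, dim))
        w = mu + sigma * eps  # reparameterization: w is a function of (mu, sigma)
        _, g = domain_loss_grad(w)
        # Gradient of the KL term: d/dmu = mu, d/dlog_sigma = sigma^2 - 1.
        grad_mu = lam * g.mean(axis=0) + mu
        grad_log_sigma = lam * (g * eps * sigma).mean(axis=0) + sigma ** 2 - 1.0
        mu -= lr * grad_mu
        log_sigma -= lr * grad_log_sigma
    return mu, np.exp(log_sigma)


mu, sigma = fit_prior()
print("learned prior mean:", mu)
print("learned prior std: ", sigma)
```

In this toy setting, the fitted prior shifts mass toward weights that satisfy the constraint, while the unconstrained dimension stays at the standard normal. A posterior built on such a prior would inherit that preference, which is the behavior the abstract describes.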