In this paper, we focus on mean-field variational Bayesian Neural Networks (BNNs) and explore their representation capacity by investigating which types of concepts a BNN is less likely to encode. Previous studies have observed that a relatively small set of interactive concepts usually emerges in the knowledge representation of a sufficiently trained neural network, and that such concepts can faithfully explain the network output. Building on this, we prove that BNNs are less likely than standard deep neural networks (DNNs) to encode complex concepts. Experiments verify our theoretical proofs. Note that the tendency to encode less complex concepts does not necessarily imply weak representation power, because complex concepts exhibit low generalization power and high adversarial vulnerability. The code is available at https://github.com/sjtu-xai-lab/BNN-concepts.