We investigate the benefit of treating all the parameters in a Bayesian neural network stochastically and find compelling theoretical and empirical evidence that this standard construction may be unnecessary. To this end, we prove that expressive predictive distributions require only small amounts of stochasticity. In particular, partially stochastic networks with only $n$ stochastic biases are universal probabilistic predictors for $n$-dimensional predictive problems. In empirical investigations, we find no systematic benefit of full stochasticity across four different inference modalities and eight datasets; partially stochastic networks can match and sometimes even outperform fully stochastic networks, despite their reduced memory costs.
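As a minimal illustrative sketch of the idea of partial stochasticity (not the construction used in our proofs or experiments), the following pure-Python example builds a small network whose weights are fixed and whose only random quantities are $n$ additive bias perturbations in the first layer. The architecture, the `tanh` nonlinearity, the standard normal noise, and the names `make_network`, `forward`, and `sample_predictions` are all assumptions made for illustration.

```python
import math
import random


def make_network(in_dim, hidden_dim, out_dim, rng):
    """Draw the deterministic parameters once; they are never resampled.

    In practice these would be learned; random values are used here
    purely as a stand-in (an assumption of this sketch).
    """
    w1 = [[rng.gauss(0.0, 1.0) for _ in range(in_dim)] for _ in range(hidden_dim)]
    b1 = [rng.gauss(0.0, 1.0) for _ in range(hidden_dim)]
    w2 = [[rng.gauss(0.0, 1.0) for _ in range(hidden_dim)] for _ in range(out_dim)]
    b2 = [0.0] * out_dim
    return w1, b1, w2, b2


def forward(params, x, z):
    """Forward pass in which z is the only source of randomness.

    z holds len(z) bias perturbations that are added to the first
    len(z) hidden units; every other parameter is deterministic.
    """
    w1, b1, w2, b2 = params
    hidden = []
    for j, (row, b) in enumerate(zip(w1, b1)):
        noise = z[j] if j < len(z) else 0.0
        pre_activation = sum(w * xi for w, xi in zip(row, x)) + b + noise
        hidden.append(math.tanh(pre_activation))
    return [sum(w * h for w, h in zip(row, hidden)) + b for row, b in zip(w2, b2)]


def sample_predictions(params, x, n_stochastic, num_samples, rng):
    """Approximate the predictive distribution at x by resampling only z."""
    samples = []
    for _ in range(num_samples):
        z = [rng.gauss(0.0, 1.0) for _ in range(n_stochastic)]
        samples.append(forward(params, x, z))
    return samples
```

Calling `sample_predictions(params, x, n_stochastic=1, num_samples=1000, rng=random.Random(0))` for a one-dimensional output returns 1000 draws that differ only through the single stochastic bias, while the memory cost of the stochastic part is one scalar rather than a distribution over every weight.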