Deep Ensembles (DEs) demonstrate improved accuracy, calibration and robustness to perturbations over single neural networks partly due to their functional diversity. Particle-based variational inference (ParVI) methods enhance diversity by formalizing a repulsion term based on a network similarity kernel. However, weight-space repulsion is inefficient due to over-parameterization, while direct function-space repulsion has been found to produce little improvement over DEs. To sidestep these difficulties, we propose First-order Repulsive Deep Ensemble (FoRDE), an ensemble learning method based on ParVI, which performs repulsion in the space of first-order input gradients. As input gradients uniquely characterize a function up to translation and are much smaller in dimension than the weights, this method guarantees that ensemble members are functionally different. Intuitively, diversifying the input gradients encourages each network to learn different features, which is expected to improve the robustness of an ensemble. Experiments on image classification datasets show that FoRDE significantly outperforms the gold-standard DEs and other ensemble methods in accuracy and calibration under covariate shift due to input perturbations.
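To make the idea of repulsion in input-gradient space concrete, the following is a minimal NumPy sketch and not the authors' implementation. It assumes a hypothetical one-hidden-layer tanh network per ensemble member, unit-normalized input gradients, an RBF kernel with a fixed bandwidth, and an SVGD-style repulsive direction; none of these choices is stated in the abstract. It only computes the kernel and the repulsive direction over input gradients. Propagating that direction back to the weights would require automatic differentiation and is omitted.

```python
import numpy as np

def input_gradient(W1, b1, w2, x):
    """Gradient of f(x) = w2 . tanh(W1 x + b1) with respect to the input x."""
    h = np.tanh(W1 @ x + b1)
    return W1.T @ (w2 * (1.0 - h ** 2))

def rbf_kernel(S, bandwidth):
    """Pairwise RBF kernel between the rows of S (one row per ensemble member)."""
    sq_dist = np.sum((S[:, None, :] - S[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq_dist / (2.0 * bandwidth ** 2))

def repulsion(S, bandwidth):
    """SVGD-style repulsive direction for each row of S.

    Row i equals sum_j k(s_i, s_j) * (s_i - s_j) / bandwidth^2, which points
    away from the input gradients of the other members.
    """
    K = rbf_kernel(S, bandwidth)
    diff = S[:, None, :] - S[None, :, :]
    return np.sum(K[:, :, None] * diff, axis=1) / bandwidth ** 2

# Hypothetical toy ensemble: 4 members, 5 input dimensions, 8 hidden units.
rng = np.random.default_rng(0)
n_members, d_in, d_hidden = 4, 5, 8
x = rng.normal(size=d_in)

members = [
    (rng.normal(size=(d_hidden, d_in)),  # W1
     rng.normal(size=d_hidden),          # b1
     rng.normal(size=d_hidden))          # w2
    for _ in range(n_members)
]

# One input gradient per member, evaluated at the same input x.
S = np.stack([input_gradient(W1, b1, w2, x) for W1, b1, w2 in members])
# Assumption: compare gradient directions only, so rescale each row to unit norm.
S = S / np.linalg.norm(S, axis=1, keepdims=True)

K = rbf_kernel(S, bandwidth=1.0)  # similarity between members
R = repulsion(S, bandwidth=1.0)   # direction that pushes the gradients apart
```

In this toy setup each member has 56 weights but only a 5-dimensional input gradient, which illustrates why a kernel over input gradients operates in a much smaller space than one over weights.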