Deep Ensembles (DEs) demonstrate improved accuracy, calibration and robustness to perturbations over single neural networks partly due to their functional diversity. Particle-based variational inference (ParVI) methods enhance diversity by formalizing a repulsion term based on a network similarity kernel. However, weight-space repulsion is inefficient due to over-parameterization, while direct function-space repulsion has been found to produce little improvement over DEs. To sidestep these difficulties, we propose First-order Repulsive Deep Ensemble (FoRDE), an ensemble learning method based on ParVI, which performs repulsion in the space of first-order input gradients. As input gradients uniquely characterize a function up to translation and are much smaller in dimension than the weights, this method guarantees that ensemble members are functionally different. Intuitively, diversifying the input gradients encourages each network to learn different features, which is expected to improve the robustness of an ensemble. Experiments on image classification datasets and transfer learning tasks show that FoRDE significantly outperforms the gold-standard DEs and other ensemble methods in accuracy and calibration under covariate shift due to input perturbations.
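The idea of measuring member similarity in input-gradient space can be illustrated with a minimal sketch. Everything here is an assumption for illustration and not the kernel or update rule used by FoRDE: the toy member f(x) = tanh(w . x) + c, the RBF kernel with a fixed bandwidth, and the helper names `input_gradient`, `rbf_kernel` and `gradient_similarity` are all hypothetical. The sketch only shows that two members differing by an output translation have identical input gradients, and so are maximally similar under such a kernel, while a member with different weights is not.

```python
import math

def input_gradient(w, c, x):
    """Gradient of the toy member f(x) = tanh(w . x) + c with respect to x.

    The offset c translates the output and does not affect the gradient.
    """
    pre = sum(wi * xi for wi, xi in zip(w, x))
    scale = 1.0 - math.tanh(pre) ** 2  # derivative of tanh at pre
    return [scale * wi for wi in w]

def rbf_kernel(g1, g2, bandwidth=1.0):
    """RBF similarity between two input-gradient vectors (assumed kernel)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(g1, g2))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))

def gradient_similarity(members, inputs, bandwidth=1.0):
    """Pairwise member similarity, averaged over a batch of inputs."""
    n = len(members)
    K = [[0.0] * n for _ in range(n)]
    for i, (wi, ci) in enumerate(members):
        for j, (wj, cj) in enumerate(members):
            total = 0.0
            for x in inputs:
                gi = input_gradient(wi, ci, x)
                gj = input_gradient(wj, cj, x)
                total += rbf_kernel(gi, gj, bandwidth)
            K[i][j] = total / len(inputs)
    return K

# Members 0 and 1 share weights and differ only by an output translation;
# member 2 has different weights.
members = [([1.0, -0.5], 0.0), ([1.0, -0.5], 3.0), ([-0.8, 1.2], 0.0)]
inputs = [[0.2, 0.1], [-0.4, 0.7]]
K = gradient_similarity(members, inputs)
print(K[0][1], K[0][2])
```

Here `K[0][1]` is 1.0 because the translated members are indistinguishable in input-gradient space, whereas `K[0][2]` is below 1. A repulsive method would push such similarities between distinct members down; in a real network the input gradients would come from automatic differentiation, not a closed form.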