We present a method for improving the calibration of deep ensembles when labeled training data is scarce but unlabeled data is available. The approach is extremely simple to implement: for each unlabeled data point, each ensemble member is fit to a different randomly selected label. We provide a theoretical analysis based on a PAC-Bayes bound, which guarantees that fitting such a labeling on the unlabeled data, together with the true labels on the training data, yields low negative log-likelihood and high ensemble diversity on test samples. Empirically, through detailed experiments, we find that for small to moderately sized training sets, our ensembles are more diverse and better calibrated than standard ensembles, sometimes significantly so.
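The labeling step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `assign_random_labels`, its parameters, and the choice to sample labels without replacement across members (which requires at most as many ensemble members as classes) are assumptions made for illustration only.

```python
import numpy as np


def assign_random_labels(num_unlabeled, num_members, num_classes, seed=0):
    """Hypothetical helper: give each ensemble member its own random label
    for every unlabeled point, with no two members sharing a label on the
    same point.

    Returns an integer array of shape (num_members, num_unlabeled); row m
    holds the labels that member m would be trained to fit.
    """
    if num_members > num_classes:
        # Assumption: distinct labels per point need enough classes.
        raise ValueError("num_members must not exceed num_classes")

    rng = np.random.default_rng(seed)
    # Sorting independent uniform keys gives a uniformly random permutation
    # of the class indices for each unlabeled point (one row per point).
    keys = rng.random((num_unlabeled, num_classes))
    perms = np.argsort(keys, axis=1)
    # The first num_members entries of each permutation are distinct labels;
    # transpose so that the leading axis indexes ensemble members.
    return perms[:, :num_members].T


# Example: 3 ensemble members, 5 unlabeled points, 10 classes.
labels = assign_random_labels(num_unlabeled=5, num_members=3, num_classes=10)
print(labels.shape)
```

In this sketch, member `m` would be trained on the labeled set with its true labels plus the unlabeled points paired with `labels[m]`. How the two losses are weighted is not specified in the abstract and is left open here.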