Distribution shift occurs when the data distribution changes between training and testing, and it can significantly degrade the performance of a model deployed in the real world. Recent studies suggest that one cause of this degradation is a form of overfitting, and that proper regularization can mitigate it, especially for highly expressive models such as neural networks. In this paper, we propose a new regularization based on supervised contrastive learning to prevent such overfitting and to train models whose performance does not degrade under distribution shift. We extend the cosine similarity in the contrastive loss to a more general similarity measure and propose using different parameters in the measure depending on whether a sample is compared to a positive or a negative example; we show analytically that this acts as a kind of margin in the contrastive loss. Experiments on benchmark datasets that emulate distribution shifts, including subpopulation shift and domain generalization, demonstrate the advantage of the proposed method over existing regularization methods.
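To make the margin idea concrete, the following is a minimal sketch and not the method proposed in the paper. The abstract does not define the generalized similarity measure or its parameters, so the sketch assumes the standard supervised contrastive loss with plain cosine similarity and adds a hypothetical additive `margin` that is subtracted only from positive-pair similarities. The names `cosine`, `supcon_loss`, `temperature`, and `margin` are illustrative assumptions.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def supcon_loss(features, labels, temperature=0.1, margin=0.0):
    """Supervised contrastive loss with a hypothetical positive-pair margin.

    With margin=0.0 this is the usual supervised contrastive loss on
    cosine similarities. The margin is an illustrative stand-in for
    treating positive and negative comparisons differently; it is an
    assumption, not the paper's similarity measure.
    """
    n = len(features)
    anchor_losses = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        positives = [j for j in others if labels[j] == labels[i]]
        if not positives:
            continue  # an anchor without positives contributes nothing

        logits = {}
        for j in others:
            sim = cosine(features[i], features[j])
            if labels[j] == labels[i]:
                sim -= margin  # assumed: positives must win by `margin`
            logits[j] = sim / temperature

        # Numerically stable log-sum-exp over all other samples.
        peak = max(logits.values())
        log_denom = peak + math.log(
            sum(math.exp(value - peak) for value in logits.values())
        )

        # Average negative log-probability over this anchor's positives.
        total = sum(logits[j] - log_denom for j in positives)
        anchor_losses.append(-total / len(positives))

    if not anchor_losses:
        return 0.0
    return sum(anchor_losses) / len(anchor_losses)


if __name__ == "__main__":
    feats = [[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.2, 0.9]]
    labels = [0, 0, 1, 1]
    print("no margin:  ", supcon_loss(feats, labels))
    print("with margin:", supcon_loss(feats, labels, margin=0.2))
```

In this sketch, a positive margin is algebraically equivalent to multiplying every negative term in the softmax denominator by `exp(margin / temperature)`, so the loss stays high until positives are more similar to the anchor than negatives by roughly that margin. Whether the paper's parameterization reduces to this exact form is not stated in the abstract.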