Model overconfidence and poor calibration are common in machine learning and difficult to account for when applying standard empirical risk minimization. In this work, we propose a novel method to alleviate these problems that we call odd-$k$-out learning (OKO), which minimizes the cross-entropy error for sets rather than for single examples. This naturally allows the model to capture correlations across data examples and achieves both better accuracy and better calibration, especially in limited-training-data and class-imbalanced regimes. Perhaps surprisingly, OKO often yields better calibration even when training with hard labels and without any additional calibration parameter tuning, such as temperature scaling. We provide theoretical justification, establishing that OKO naturally yields better calibration, and provide extensive experimental analyses that corroborate our theoretical findings. We emphasize that OKO is a general framework that can be easily adapted to many settings and that the trained model can be applied to single examples at inference time, without introducing significant run-time overhead or architecture changes.
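To make the idea of a cross-entropy over sets concrete, the following is a minimal NumPy sketch, not the implementation used in this work. The abstract does not specify how sets are built or how per-example outputs are combined, so the sketch makes two loudly stated assumptions: each set contains two examples from a shared "pair" class plus $k$ "odd" examples from other classes, and the per-example logits are summed over the set before a single softmax is applied, with the pair class as the hard label. The names `sample_set` and `set_cross_entropy` are hypothetical.

```python
import numpy as np


def log_softmax(z):
    """Numerically stable log-softmax over the last axis."""
    z = z - z.max(axis=-1, keepdims=True)
    return z - np.log(np.exp(z).sum(axis=-1, keepdims=True))


def sample_set(labels, k, rng):
    """Sample indices for one set (ASSUMED construction, see lead-in).

    Returns (indices, pair_class): the first two indices share the class
    `pair_class`; the remaining k indices come from other classes.
    Requires at least two examples of the pair class and k of the rest.
    """
    pair_class = rng.choice(np.unique(labels))
    pair_idx = rng.choice(np.flatnonzero(labels == pair_class), size=2, replace=False)
    odd_idx = rng.choice(np.flatnonzero(labels != pair_class), size=k, replace=False)
    return np.concatenate([pair_idx, odd_idx]), pair_class


def set_cross_entropy(logits, set_labels):
    """Cross-entropy computed per set rather than per example.

    logits:     array of shape (num_sets, set_size, num_classes)
    set_labels: integer array of shape (num_sets,), one hard label per set
    ASSUMPTION: set members are aggregated by summing their logits.
    """
    summed = logits.sum(axis=1)            # (num_sets, num_classes)
    logp = log_softmax(summed)             # one distribution per set
    picked = logp[np.arange(len(set_labels)), set_labels]
    return -picked.mean()


# Usage: a toy dataset with 3 classes and a stand-in for model outputs.
rng = np.random.default_rng(0)
labels = np.array([0, 0, 1, 1, 2, 2])
fake_logits = rng.normal(size=(len(labels), 3))   # placeholder for f(x)
idx, pair_class = sample_set(labels, k=1, rng=rng)
loss = set_cross_entropy(fake_logits[idx][None], np.array([pair_class]))
```

Under these assumptions the loss depends on all members of a set jointly, which is one way a model could be pushed to account for correlations across examples during training. At inference time the same network would simply be applied to a single example, consistent with the claim that no architecture change is needed.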