Model overconfidence and poor calibration are common in machine learning and are difficult to address with standard empirical risk minimization. In this work, we propose a novel method to alleviate these problems, which we call odd-$k$-out learning (OKO); it minimizes the cross-entropy error for sets of examples rather than for single examples. This naturally allows the model to capture correlations across data examples and achieves both better accuracy and better calibration, especially in regimes with limited training data or class imbalance. Perhaps surprisingly, OKO often yields better calibration even when training with hard labels and without any additional tuning of calibration parameters, such as temperature scaling. We provide theoretical justification, establishing that OKO naturally yields better calibration, and provide extensive experimental analyses that corroborate our theoretical findings. We emphasize that OKO is a general framework that can be easily adapted to many settings, and that the trained model can be applied to single examples at inference time without introducing significant run-time overhead or architecture changes.
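The abstract does not say how sets are constructed or scored, so the following is only a minimal sketch of what a set-level cross-entropy could look like, not the authors' implementation. It assumes that per-example logits are aggregated by summation and that each set carries one hard label. The function names `set_ce` and `single_example_ce`, the aggregation rule, and the toy logits are all hypothetical.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(z - m) for z in logits))
    return [z - lse for z in logits]


def single_example_ce(logits, label):
    """Standard cross-entropy for one example with a hard label."""
    return -log_softmax(logits)[label]


def set_ce(set_logits, set_label):
    """Cross-entropy for a whole set of examples.

    ASSUMPTION: logits are summed across the set before the softmax.
    The abstract does not specify the aggregation rule.
    """
    n_classes = len(set_logits[0])
    summed = [sum(example[c] for example in set_logits) for c in range(n_classes)]
    return -log_softmax(summed)[set_label]


# Hypothetical set: two examples leaning towards class 0 and one "odd"
# example leaning towards class 2. The set-level label is taken to be 0.
set_logits = [
    [2.0, 0.5, 0.1],
    [1.5, 0.2, 0.3],
    [0.1, 0.4, 1.8],
]
loss = set_ce(set_logits, set_label=0)
```

Under this assumption, a set containing a single example reduces to the usual per-example cross-entropy, which is consistent with the claim that the trained model can be applied to single examples at inference time.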