Model overconfidence and poor calibration are common in machine learning and difficult to account for when applying standard empirical risk minimization. In this work, we propose a novel method to alleviate these problems, which we call odd-$k$-out learning (OKO). OKO minimizes the cross-entropy error for sets of examples rather than for single examples. This naturally allows the model to capture correlations across data examples and achieves both better accuracy and better calibration, especially in regimes with limited training data or class imbalance. Perhaps surprisingly, OKO often yields better calibration even when training with hard labels and without any additional calibration parameter tuning, such as temperature scaling. We provide theoretical justification, establishing that OKO naturally yields better calibration, and extensive experimental analyses that corroborate our theoretical findings. We emphasize that OKO is a general framework that can be easily adapted to many settings, and that the trained model can be applied to single examples at inference time without introducing significant run-time overhead or architecture changes.
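To make the idea of a set-level cross-entropy concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not specify how sets are built or scored, so the sketch assumes that each set contains two examples from one "pair" class plus $k$ "odd" examples from other classes, that per-example logits are summed over the set, and that the hard target is the pair class. The names `sample_oko_set` and `set_cross_entropy` are hypothetical.

```python
import numpy as np


def sample_oko_set(labels, k, rng):
    """Sample indices for one training set (assumed construction).

    Two examples share a 'pair' class; k 'odd' examples come from k other,
    mutually distinct classes. Requires at least k + 1 classes and at least
    two examples in the class drawn as the pair class.
    """
    classes = np.unique(labels)
    chosen = rng.choice(classes, size=k + 1, replace=False)
    pair_class, odd_classes = chosen[0], chosen[1:]
    # Two distinct examples from the pair class.
    pair_idx = rng.choice(np.flatnonzero(labels == pair_class), size=2, replace=False)
    # One example from each odd class.
    odd_idx = [rng.choice(np.flatnonzero(labels == c)) for c in odd_classes]
    indices = np.concatenate([pair_idx, np.array(odd_idx, dtype=int)])
    return indices, int(pair_class)


def set_cross_entropy(logits, target):
    """Cross-entropy for one set (assumed aggregation: sum of logits).

    logits: array of shape (set_size, num_classes), one row per set member.
    target: integer class index used as the hard label for the whole set.
    """
    summed = logits.sum(axis=0)
    shifted = summed - summed.max()  # subtract max for numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum())
    return -log_probs[target]
```

Under these assumptions, a training step would call `sample_oko_set` to pick the set members, run the usual per-example network on each member to obtain `logits`, and backpropagate `set_cross_entropy(logits, pair_class)`. Because the network itself still maps one example to one logit vector, it can be applied to single examples at inference time, which is consistent with the claim in the abstract.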