Model overconfidence and poor calibration are common in machine learning and difficult to account for when applying standard empirical risk minimization. In this work, we propose a novel method to alleviate these problems, which we call odd-$k$-out learning (OKO); it minimizes the cross-entropy error for sets rather than for single examples. This naturally allows the model to capture correlations across data examples and achieves both better accuracy and better calibration, especially in limited-training-data and class-imbalanced regimes. Perhaps surprisingly, OKO often yields better calibration even when training with hard labels and without any additional calibration parameter tuning, such as temperature scaling. We demonstrate this in extensive experimental analyses and provide a mathematical theory to interpret our findings. We emphasize that OKO is a general framework that can be easily adapted to many settings, and that a trained model can be applied to single examples at inference time without significant run-time overhead or architecture changes.
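To make the contrast between a per-example and a per-set cross-entropy concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not say how a set is scored, so the sketch assumes that the per-example logits are summed into one set-level logit vector and that each set carries a single hard label. The function names, the logit values, and the choice of label are all hypothetical.

```python
import math
from typing import List, Sequence


def log_softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable log-softmax over one logit vector."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def single_example_ce(logits: Sequence[float], label: int) -> float:
    """Standard cross-entropy for one example with a hard label."""
    return -log_softmax(logits)[label]


def set_ce(set_logits: Sequence[Sequence[float]], set_label: int) -> float:
    """Cross-entropy for a whole set of examples.

    Assumption (not stated in the abstract): the per-example logits are
    summed into one set-level logit vector, and the set carries a single
    hard label.
    """
    num_classes = len(set_logits[0])
    summed = [sum(ex[c] for ex in set_logits) for c in range(num_classes)]
    return -log_softmax(summed)[set_label]


# Hypothetical logits for a set of three examples over three classes:
# two examples favour class 0, one "odd" example favours class 2.
example_set = [
    [2.0, 0.1, -1.0],
    [1.5, 0.3, -0.5],
    [-0.5, 0.2, 1.8],
]
loss_set = set_ce(example_set, set_label=0)
loss_single = single_example_ce(example_set[0], label=0)
```

Under this assumption, a set containing one example reduces to the ordinary per-example loss, which is consistent with the statement that a trained model can be applied to single examples at inference time without architecture changes.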