Recent studies have revealed that, beyond conventional accuracy, calibration should also be considered when training modern deep neural networks. To address miscalibration during learning, some methods have explored different penalty functions as part of the learning objective, alongside a standard classification loss, with a hyper-parameter controlling the relative contribution of each term. Nevertheless, these methods share two major drawbacks: 1) the scalar balancing weight is the same for all classes, which hinders the ability to address different intrinsic difficulties or imbalance among classes; and 2) the balancing weight is usually fixed without an adaptive strategy, which may prevent training from reaching the best compromise between accuracy and calibration, and requires a hyper-parameter search for each application. We propose Class Adaptive Label Smoothing (CALS) for calibrating deep networks, which learns class-wise multipliers during training, yielding a powerful alternative to common label smoothing penalties. Our method builds on a general Augmented Lagrangian approach, a well-established technique in constrained optimization, but we introduce several modifications to tailor it for large-scale, class-adaptive training. Comprehensive evaluation and multiple comparisons on a variety of benchmarks, including standard and long-tailed image classification, semantic segmentation, and text classification, demonstrate the superiority of the proposed method. The code is available at https://github.com/by-liu/CALS.
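To make the contrast between a single fixed balancing weight and class-wise multipliers concrete, the following is a minimal sketch of a textbook augmented Lagrangian update with one multiplier per class. It is not the authors' implementation. The abstract does not state which constraint or penalty function CALS uses, so the margin constraint on logit distances, the Powell-Hestenes-Rockafellar (PHR) penalty, and all function names (`logit_distances`, `class_constraints`, `phr_penalty`, `update_multipliers`) are assumptions chosen for illustration only.

```python
import numpy as np


def logit_distances(logits):
    """Distance of each logit to the per-sample maximum logit.

    logits has shape (batch, classes); the result has the same shape.
    """
    return logits.max(axis=1, keepdims=True) - logits


def class_constraints(logits, margin):
    """Per-class constraint values h_k, where h_k <= 0 means 'satisfied'.

    ASSUMPTION: the constraint is 'logit distance <= margin', averaged
    over the batch. The abstract does not specify the constraint.
    """
    return (logit_distances(logits) - margin).mean(axis=0)


def phr_penalty(h, lam, rho):
    """Standard PHR augmented Lagrangian penalty for constraints h <= 0.

    lam holds one multiplier per class, rho is the penalty parameter.
    ASSUMPTION: CALS may use a different penalty function.
    """
    return (np.maximum(0.0, lam + rho * h) ** 2 - lam ** 2) / (2.0 * rho)


def update_multipliers(h, lam, rho):
    """Classical multiplier update: grows for violated constraints only."""
    return np.maximum(0.0, lam + rho * h)


# Toy batch: one sample, three classes (hypothetical values).
logits = np.array([[2.0, 0.0, -1.0]])
lam = np.zeros(3)   # one multiplier per class, instead of one shared scalar
rho = 1.0

h = class_constraints(logits, margin=1.0)
penalty = phr_penalty(h, lam, rho).sum()   # would be added to the classification loss
lam = update_multipliers(h, lam, rho)      # class-wise weights adapt during training
```

In this sketch the multipliers play the role that the fixed scalar weight plays in the penalty-based methods described above: classes whose constraint is violated receive a larger weight at the next update, while classes that already satisfy it keep a multiplier of zero.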