We study a family of loss functions, named label-distributionally robust (LDR) losses, for multi-class classification. These losses are formulated from a distributionally robust optimization (DRO) perspective, in which the uncertainty in the given label information is modeled and captured by taking the worst case over distributional weights. The benefits of this perspective are severalfold: (i) it provides a unified framework that explains the classical cross-entropy (CE) loss, the SVM loss, and their variants; (ii) it includes a special family corresponding to the temperature-scaled CE loss, which is widely adopted but poorly understood; (iii) it allows adaptivity to the degree of label uncertainty at the instance level. Our contributions are: (1) we study both consistency and robustness by establishing top-$k$ ($\forall k\geq 1$) consistency of LDR losses for multi-class classification, together with a negative result that a top-$1$ consistent and symmetric robust loss cannot achieve top-$k$ consistency simultaneously for all $k\geq 2$; (2) we propose a new adaptive LDR loss that automatically adapts an individualized temperature parameter to the noise degree of each instance's class label; (3) we demonstrate stable and competitive performance of the proposed adaptive LDR loss on 7 benchmark datasets under 6 noisy-label settings and 1 clean setting against 13 loss functions, and on one real-world noisy dataset. The code is open-sourced at \url{https://github.com/Optimization-AI/ICML2023_LDR}.
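To make the link between the worst case over distributional weights and the temperature-scaled CE loss concrete, the following is a minimal sketch rather than the released implementation. It assumes a KL regularizer toward uniform label weights with strength `lam` and omits any margin term; the function names `ldr_kl_loss` and `worst_case_weights` are hypothetical, and the exact formulation in the paper and repository may differ. Under these assumptions, the worst case has a log-sum-exp closed form in which `lam` acts as a temperature.

```python
import numpy as np

def ldr_kl_loss(scores, y, lam=1.0):
    """Closed form of  max_p  sum_k p_k (s_k - s_y) - lam * KL(p || uniform),
    with p ranging over the probability simplex (assumed regularizer).
    The maximum equals lam * log(mean_k exp((s_k - s_y) / lam))."""
    scores = np.asarray(scores, dtype=float)
    z = (scores - scores[y]) / lam      # margins scaled by the temperature
    m = z.max()                         # shift for numerical stability
    return lam * (m + np.log(np.mean(np.exp(z - m))))

def worst_case_weights(scores, y, lam=1.0):
    """Label weights attaining the maximum above: a softmax of the
    temperature-scaled margins."""
    scores = np.asarray(scores, dtype=float)
    z = (scores - scores[y]) / lam
    w = np.exp(z - z.max())
    return w / w.sum()
```

With `lam = 1` this sketch equals the CE loss minus the constant $\log K$, where $K$ is the number of classes. As `lam` shrinks it approaches the largest margin $\max_k (s_k - s_y)$, and as `lam` grows it approaches the average margin, which illustrates how a single temperature interpolates between max-type and averaged losses.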