In the presence of noisy labels, designing robust loss functions is critical for ensuring the generalization performance of deep neural networks. Cross Entropy (CE) loss has been shown not to be robust to noisy labels because it is unbounded. To alleviate this issue, existing works typically design specialized robust losses that satisfy the symmetric condition, which often leads to underfitting. In this paper, our key idea is to induce a loss bound at the logit level, thus universally enhancing the noise robustness of existing losses. Specifically, we propose logit clipping (LogitClip), which clamps the norm of the logit vector so that it is upper bounded by a constant. In this manner, CE loss equipped with LogitClip is effectively bounded, mitigating overfitting to examples with noisy labels. Moreover, we present theoretical analyses to certify the noise tolerance of LogitClip. Extensive experiments show that LogitClip not only significantly improves the noise robustness of CE loss, but also broadly enhances the generalization performance of popular robust losses.
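As an illustration of the clipping idea described above, the following is a minimal sketch rather than the authors' implementation. It assumes an Lp norm (p = 2 by default) and a threshold `tau`, both of which are illustrative choices not fixed by the abstract; `logit_clip` and `cross_entropy` are hypothetical helper names. When the norm of the logit vector exceeds `tau`, the vector is rescaled onto the norm ball; otherwise it is left unchanged.

```python
import math

def logit_clip(logits, tau=1.0, p=2):
    """Rescale logits so that their Lp norm is at most tau.

    Hypothetical helper: tau and p are assumed hyperparameters.
    """
    norm = sum(abs(z) ** p for z in logits) ** (1.0 / p)
    if norm <= tau:
        return list(logits)          # already inside the norm ball
    scale = tau / norm
    return [z * scale for z in logits]

def cross_entropy(logits, label):
    """Numerically stable CE loss: logsumexp(logits) - logits[label]."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]

# A confident prediction for class 0 paired with a (possibly noisy) label 1.
logits = [12.0, -3.0, 0.5]
noisy_label = 1
tau = 1.0

raw_loss = cross_entropy(logits, noisy_label)
clipped_loss = cross_entropy(logit_clip(logits, tau=tau), noisy_label)

# Loose bound (not the paper's): every clipped logit lies in [-tau, tau],
# so the CE loss is at most log(K) + 2 * tau for K classes.
bound = math.log(len(logits)) + 2 * tau
```

With unclipped logits the loss on a mislabeled example can grow without limit as the logit magnitude grows, whereas the clipped loss stays below a constant that depends only on `tau` and the number of classes.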