In recent years, research on learning with noisy labels has focused on devising novel algorithms that are robust to noisy training labels while generalizing to clean data. These algorithms often incorporate sophisticated techniques such as noise modeling, label correction, and co-training. In this study, we demonstrate that a simple baseline using cross-entropy loss, combined with widely used regularization strategies such as learning rate decay, model weight averaging, and data augmentation, can outperform state-of-the-art methods. Our findings suggest that a combination of regularization strategies can be more effective than intricate algorithms in tackling the challenges of learning with noisy labels. Although some of these regularization strategies have been used in previous research on learning with noisy labels, their full potential has not been thoroughly explored. Our results encourage a reevaluation of benchmarks for learning with noisy labels and prompt a reconsideration of the role of specialized algorithms designed for training with noisy labels.
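To make the baseline concrete, the following is a minimal toy sketch of how cross-entropy training can be combined with the three regularizers named above. It is not the experimental setup of this study: the logistic model, the synthetic two-blob data, the cosine schedule (one possible form of learning rate decay), the exponential moving average (one possible form of weight averaging), the Gaussian input jitter (standing in for data augmentation), and all function names and hyperparameters are illustrative assumptions.

```python
import math
import random


def make_noisy_data(n, noise_rate, rng):
    """Two 2-D Gaussian blobs; each label is flipped with prob. noise_rate.

    Returns (x, clean_label, observed_label) triples. Synthetic data,
    assumed purely for illustration.
    """
    data = []
    for _ in range(n):
        y = rng.randrange(2)
        center = 1.5 if y == 1 else -1.5
        x = (rng.gauss(center, 1.0), rng.gauss(center, 1.0))
        noisy_y = 1 - y if rng.random() < noise_rate else y
        data.append((x, y, noisy_y))
    return data


def predict_prob(w, x):
    """Probability of class 1 under a logistic model w = [w1, w2, bias]."""
    z = w[0] * x[0] + w[1] * x[1] + w[2]
    z = max(-30.0, min(30.0, z))  # clamp the logit to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))


def train(data, epochs=30, base_lr=0.5, ema_decay=0.99, jitter=0.3, seed=0):
    """SGD on cross-entropy with LR decay, weight averaging, augmentation."""
    rng = random.Random(seed)
    w = [0.0, 0.0, 0.0]
    ema = list(w)  # averaged copy of the weights
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        order = list(range(len(data)))
        rng.shuffle(order)
        for i in order:
            x, _, noisy_y = data[i]  # only the noisy label is seen in training
            # Data augmentation (assumed form): Gaussian jitter on the input.
            xa = (x[0] + rng.gauss(0.0, jitter), x[1] + rng.gauss(0.0, jitter))
            # Learning rate decay (assumed form): cosine schedule toward zero.
            lr = 0.5 * base_lr * (1.0 + math.cos(math.pi * step / total_steps))
            # Gradient of binary cross-entropy w.r.t. the logit is (p - y).
            g = predict_prob(w, xa) - noisy_y
            w[0] -= lr * g * xa[0]
            w[1] -= lr * g * xa[1]
            w[2] -= lr * g
            # Weight averaging (assumed form): exponential moving average.
            ema = [ema_decay * e + (1.0 - ema_decay) * wi
                   for e, wi in zip(ema, w)]
            step += 1
    return w, ema


def clean_accuracy(w, data):
    """Accuracy measured against the clean labels."""
    correct = sum((predict_prob(w, x) > 0.5) == (y == 1) for x, y, _ in data)
    return correct / len(data)


if __name__ == "__main__":
    rng = random.Random(1)
    train_set = make_noisy_data(400, noise_rate=0.3, rng=rng)
    test_set = make_noisy_data(400, noise_rate=0.0, rng=rng)
    w, ema = train(train_set)
    print("last-iterate clean accuracy:", clean_accuracy(w, test_set))
    print("averaged-weights clean accuracy:", clean_accuracy(ema, test_set))
```

The sketch trains only on the noisy labels and evaluates on clean ones, mirroring the setting described above. It is meant to show how the pieces fit together in one training loop, not to reproduce any reported result.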