Mitigating the detrimental effects of noisy labels on the training process has become increasingly critical, as obtaining entirely clean or human-annotated samples for large-scale pre-training tasks is often impractical. Nonetheless, existing noise mitigation methods often encounter limitations in practical applications due to their task-specific design, model dependency, and significant computational overhead. In this work, we exploit the properties of high-dimensional orthogonality to identify a robust and effective boundary in cone space for separating clean and noisy samples. Building on this, we propose One-Step Anti-noise (OSA), a model-agnostic noisy label mitigation paradigm that employs an estimator model and a scoring function to assess the noise level of input pairs through just one-step inference. We empirically validate the superiority of OSA, demonstrating its enhanced training robustness, improved task transferability, streamlined deployment, and reduced computational overhead across diverse benchmarks, models, and tasks. Our code is released at https://github.com/leolee99/OSA.
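To make the orthogonality intuition concrete, the following is a minimal sketch and not the authors' implementation. It assumes the estimator model yields one embedding for each element of an input pair and that the score is the cosine similarity between the two embeddings; the abstract specifies neither. The names `cosine_similarity`, `noise_weight`, and `mean_abs_cosine`, and the boundary value of 0, are hypothetical choices made for illustration. The sketch shows that independent random vectors become nearly orthogonal as dimension grows, which is why a similarity close to zero is a plausible signature of an unrelated (noisy) pair.

```python
import math
import random


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def noise_weight(similarity, boundary=0.0):
    """Hypothetical scoring function (an assumption, not from the paper):
    pairs at or below the boundary get weight 0, others keep their similarity."""
    if similarity <= boundary:
        return 0.0
    return min(similarity, 1.0)


def random_vector(dim, rng):
    """Draw a vector with independent standard normal components."""
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]


def mean_abs_cosine(dim, trials, rng):
    """Average |cosine| between independent random vectors of dimension dim."""
    total = 0.0
    for _ in range(trials):
        u = random_vector(dim, rng)
        v = random_vector(dim, rng)
        total += abs(cosine_similarity(u, v))
    return total / trials


rng = random.Random(0)  # fixed seed so the run is repeatable
low_dim = mean_abs_cosine(dim=4, trials=200, rng=rng)
high_dim = mean_abs_cosine(dim=512, trials=200, rng=rng)

# Unrelated vectors sit much closer to orthogonal in the higher dimension.
print(f"mean |cos| in 4 dims:   {low_dim:.3f}")
print(f"mean |cos| in 512 dims: {high_dim:.3f}")
```

In a training loop, a weight of this kind could scale each sample's loss after a single forward pass of the estimator, which is one way to read "one-step inference"; the actual scoring function used by OSA is defined in the paper and the released code.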