Learning with noisy labels (LNL) poses a significant challenge: training a model that generalizes well while avoiding overfitting to corrupted labels. Recent advances have achieved impressive performance by identifying clean labels and correcting corrupted ones for training. However, current approaches rely heavily on the model's predictions and evaluate each sample independently, without considering either the global or the local structure of the sample distribution. These limitations typically yield suboptimal identification and correction, which eventually leads to models overfitting to incorrect labels. In this paper, we propose a novel optimal transport (OT) formulation, called Curriculum and Structure-aware Optimal Transport (CSOT). CSOT concurrently considers the inter- and intra-distribution structure of the samples to construct a robust denoising and relabeling allocator. During training, the allocator incrementally assigns reliable labels to the fraction of samples with the highest confidence. These labels have both global discriminability and local coherence. Notably, CSOT is a new OT formulation with a nonconvex objective function and curriculum constraints, so it is not directly compatible with classical OT solvers. We therefore develop a lightspeed computational method that involves a scaling iteration within a generalized conditional gradient framework to solve CSOT efficiently. Extensive experiments demonstrate the superiority of our method over current state-of-the-art LNL methods. Code is available at https://github.com/changwxx/CSOT-for-LNL.
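To make the idea of a budget-constrained allocator concrete, the following is a minimal sketch, not the CSOT objective itself. It assumes an entropic-regularized OT problem whose cost is the negative log of the model's predictions, with each sample allowed to receive at most 1/n mass and each class required to receive an equal share of a curriculum budget. It covers only a scaling iteration of the kind mentioned above; the structure-aware nonconvex terms and the generalized conditional gradient loop are omitted. The names `curriculum_ot_labels`, `budget`, `eps`, and `n_iters` are hypothetical and do not come from the released code.

```python
import numpy as np


def curriculum_ot_labels(probs, budget, eps=0.1, n_iters=200):
    """Illustrative budget-constrained entropic OT allocator (assumption, not CSOT).

    probs  : (n, c) array of predicted class probabilities.
    budget : fraction of the total mass to assign, in (0, 1].
    Returns the coupling Q, a hard label per sample, and the selected indices.
    """
    n, c = probs.shape
    # Cost of assigning sample i to class j: negative log-probability.
    cost = -np.log(np.clip(probs, 1e-12, None))
    K = np.exp(-cost / eps)          # Gibbs kernel of the entropic problem

    a = np.full(n, 1.0 / n)          # upper bound on the mass of each sample
    b = np.full(c, budget / c)       # mass each class must receive

    u = np.ones(n)
    v = np.ones(c)
    for _ in range(n_iters):
        # Inequality constraint on rows: scale down only where the cap is exceeded.
        u = np.minimum(a / (K @ v), 1.0)
        # Equality constraint on columns: standard Sinkhorn-style scaling.
        v = b / (K.T @ u)

    Q = u[:, None] * K * v[None, :]
    labels = Q.argmax(axis=1)        # class receiving most of each sample's mass
    row_mass = Q.sum(axis=1)         # how much of each sample was allocated
    k = int(round(budget * n))       # number of samples to keep at this stage
    selected = np.argsort(-row_mass)[:k]
    return Q, labels, selected
```

Under these assumptions, a curriculum would call the function repeatedly with a growing `budget` (for example 0.3, 0.5, 0.8, 1.0 over the course of training), so that only the most confidently allocated samples receive labels early on and the selected set expands as the model improves.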