Noisy labels are inevitable, even in well-annotated datasets, and detecting them is important for improving the robustness of speaker recognition models. In this paper, we propose a novel noisy label detection approach based on two new statistical metrics: Continuous Inconsistent Counting (CIC) and Total Inconsistent Counting (TIC). Both metrics are computed through Cross-Epoch Counting (CEC) and correspond to the early and late stages of training, respectively. Additionally, we divide samples into three categories according to their prediction results: inconsistent samples, hard samples, and easy samples. During training, we gradually increase the difficulty of the hard samples used to update model parameters, which prevents the model from overfitting to noisy labels. Compared with the schemes we evaluate against, our approach not only achieves the best speaker verification performance but also excels in noisy label detection.
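To make the idea of cross-epoch counting concrete, the following is a minimal sketch and not the paper's exact definition. It assumes that a sample is "inconsistent" in an epoch when the model's predicted speaker differs from the annotated label, that CIC is the longest run of consecutive inconsistent epochs, and that TIC is the total number of inconsistent epochs. The function name `inconsistency_counts` and both of these readings are assumptions made for illustration.

```python
from typing import List, Tuple


def inconsistency_counts(predictions: List[int], label: int) -> Tuple[int, int]:
    """Return (cic, tic) for one sample.

    predictions: the predicted class of this sample at each epoch, in order.
    label: the (possibly noisy) annotated class of the sample.

    Assumed definitions (illustrative only):
      cic = longest run of consecutive epochs where prediction != label
      tic = total number of epochs where prediction != label
    """
    cic = 0  # longest consecutive inconsistent run seen so far
    tic = 0  # total inconsistent epochs
    run = 0  # length of the current inconsistent run
    for pred in predictions:
        if pred != label:
            tic += 1
            run += 1
            cic = max(cic, run)
        else:
            run = 0  # a consistent epoch breaks the run
    return cic, tic


# Hypothetical per-epoch predictions for a sample annotated as class 0.
cic, tic = inconsistency_counts([1, 1, 0, 1, 1, 1, 0], label=0)
print(cic, tic)
```

Under these assumptions, a sample whose counts stay high across epochs would be a candidate noisy label; any threshold for flagging it would have to be chosen separately and is not specified here.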