Label error is a ubiquitous problem in annotated data. Large amounts of label error substantially degrade the quality of deep learning models. Existing methods for handling label errors largely focus on classification, and either rely on task-specific architectures or require non-trivial additional computation, which is undesirable or even infeasible for industry use. In this paper, we propose LEDO: a model-agnostic and computationally efficient framework for Label Error Detection and Overwrite. LEDO is based on Monte Carlo Dropout combined with uncertainty metrics, and can be easily generalized to multiple tasks and datasets. Applying LEDO to an industry opinion-based question answering system shows that it improves accuracy in all the core models. Specifically, LEDO brings a 1.1% MRR gain for the retrieval model, a 1.5% PR AUC improvement for the machine reading comprehension model, and a 0.9% rise in Average Precision for the ranker, on top of strong baselines on a large-scale social media dataset. Importantly, LEDO is computationally efficient compared to methods that require changing the loss function, and cost-effective because the resulting data can be used in the same continuous training pipeline for production. Further analysis shows that these gains come from an improved decision boundary after cleaning the label errors present in the training data.
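To make the idea of Monte Carlo Dropout combined with an uncertainty metric concrete, the following is a minimal sketch and not the paper's implementation. The abstract does not specify which uncertainty metrics, thresholds, or overwrite rule LEDO uses, so everything here is an assumption for illustration: predictive entropy as the metric, the `min_conf` and `max_entropy` thresholds, the rule of overwriting a label only when the averaged prediction confidently disagrees with it, and the hypothetical `toy_dropout_model` standing in for a trained network with dropout kept active at inference.

```python
import numpy as np


def toy_dropout_model(x, rng, keep=0.8):
    """Hypothetical stand-in for a trained classifier with dropout left on."""
    # Fixed weights for a 4-feature, 2-class linear model (illustrative only).
    w = np.array([[2.0, -2.0]] * 4)
    # Inverted dropout on the inputs: drop each feature with prob 1 - keep.
    mask = rng.random(x.shape) < keep
    logits = (x * mask / keep) @ w
    # Numerically stable softmax over classes.
    logits = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(logits)
    return e / e.sum(axis=1, keepdims=True)


def mc_dropout_probs(predict_fn, x, n_passes=20, seed=0):
    """Run n_passes stochastic forward passes; returns shape (T, N, C)."""
    rng = np.random.default_rng(seed)
    return np.stack([predict_fn(x, rng) for _ in range(n_passes)])


def detect_and_overwrite(labels, probs, min_conf=0.9, max_entropy=0.3):
    """Flag labels that the averaged MC prediction confidently contradicts.

    labels: (N,) integer class labels as annotated.
    probs:  (T, N, C) class probabilities from T stochastic passes.
    Thresholds are assumed values, not taken from the paper.
    """
    mean = probs.mean(axis=0)                      # (N, C) averaged prediction
    pred = mean.argmax(axis=1)                     # most likely class
    conf = mean.max(axis=1)                        # its averaged probability
    entropy = -(mean * np.log(mean + 1e-12)).sum(axis=1)  # predictive entropy
    suspect = (pred != labels) & (conf >= min_conf) & (entropy <= max_entropy)
    cleaned = np.where(suspect, pred, labels)      # overwrite suspects only
    return cleaned, suspect


# Usage: score a small batch and overwrite confidently contradicted labels.
x = np.ones((5, 4))
noisy_labels = np.array([0, 1, 0, 0, 1])
probs = mc_dropout_probs(toy_dropout_model, x, n_passes=20)
cleaned_labels, suspect_mask = detect_and_overwrite(noisy_labels, probs)
```

Because the detection step only needs forward passes and produces a relabeled copy of the training set, a sketch like this leaves the model architecture and loss function untouched, which matches the model-agnostic framing in the abstract.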