Distantly-Supervised Named Entity Recognition (DS-NER) is widely used in real-world scenarios. It alleviates the annotation burden by matching entities in existing knowledge bases against snippets of text, but it suffers from label noise. Recent works adopt the teacher-student framework to gradually refine the training labels and improve overall robustness. However, these teacher-student methods achieve limited performance because the poorly calibrated teacher network produces incorrectly pseudo-labeled samples, which leads to error propagation. We therefore propose: (1) Uncertainty-Aware Teacher Learning, which leverages prediction uncertainty to reduce the number of incorrect pseudo labels in the self-training stage; and (2) Student-Student Collaborative Learning, which allows reliable labels to be transferred between two student networks instead of having each student rely indiscriminately on all pseudo labels from its teacher, and which further enables a full exploration of mislabeled samples rather than simply filtering out unreliable pseudo-labeled samples. We evaluate the proposed method on five DS-NER datasets and show that it outperforms state-of-the-art DS-NER methods.
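The idea of using prediction uncertainty to discard unreliable pseudo labels can be illustrated with a small sketch. The abstract does not say how uncertainty is estimated, so this example makes assumptions: it takes the entropy of the class distribution averaged over several stochastic forward passes of the teacher (for example with dropout active) and keeps a token's pseudo label only when that entropy is below a threshold. The function names, the `max_entropy` parameter, and its default value are hypothetical and are not taken from the paper.

```python
import math


def predictive_entropy(prob_samples):
    """Average several class distributions and return (mean, entropy of mean).

    prob_samples: list of probability vectors for ONE token, one vector per
    stochastic forward pass of the teacher (assumed setup, not from the paper).
    """
    n = len(prob_samples)
    num_classes = len(prob_samples[0])
    mean = [sum(p[c] for p in prob_samples) / n for c in range(num_classes)]
    # Shannon entropy in nats; zero-probability classes contribute nothing.
    entropy = -sum(q * math.log(q) for q in mean if q > 0.0)
    return mean, entropy


def select_pseudo_labels(token_samples, max_entropy=0.5):
    """Return one (label, keep) pair per token.

    label is the argmax of the averaged distribution; keep is False when the
    predictive entropy exceeds max_entropy (a hypothetical threshold).
    """
    selected = []
    for prob_samples in token_samples:
        mean, entropy = predictive_entropy(prob_samples)
        label = max(range(len(mean)), key=mean.__getitem__)
        selected.append((label, entropy <= max_entropy))
    return selected


if __name__ == "__main__":
    confident = [[0.90, 0.05, 0.05], [0.95, 0.03, 0.02]]  # passes agree
    uncertain = [[0.60, 0.30, 0.10], [0.20, 0.70, 0.10]]  # passes disagree
    print(select_pseudo_labels([confident, uncertain]))
```

In this sketch the first token keeps its pseudo label because the sampled predictions agree, while the second is flagged as unreliable. Under the scheme the abstract describes, flagged tokens would be explored further through the two student networks rather than simply dropped.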