Named Entity Recognition (NER) is an essential stepping stone in natural language processing. Although various distantly supervised models have achieved promising performance, we argue that distant supervision inevitably introduces incomplete and noisy annotations, which may mislead model training. To address this issue, we propose BOND-MoE, a robust NER model based on Mixture of Experts (MoE). Instead of relying on a single model for NER prediction, we train and ensemble multiple models under the Expectation-Maximization (EM) framework, which substantially reduces the impact of noisy supervision. In addition, we introduce a fair assignment module to balance the document-model assignment process. Extensive experiments on real-world datasets show that the proposed method achieves state-of-the-art performance compared with other distantly supervised NER methods.
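To make the EM-with-experts idea and the balanced document-model assignment more concrete, the following is a minimal sketch and not the authors' implementation. It assumes that each expert has already scored each document with a log-likelihood, and it uses a Sinkhorn-style normalization as a stand-in for the fair assignment module, whose actual form the abstract does not specify. All function names (`e_step`, `balance`, `m_step_prior`) are hypothetical, and the retraining of each expert on its weighted documents, which a full M-step would include, is omitted.

```python
import numpy as np


def e_step(log_lik, log_prior):
    """Posterior responsibility of each expert for each document.

    log_lik:   (n_docs, n_experts) log-likelihood of each document under each expert
    log_prior: (n_experts,) log mixing weights
    """
    logits = log_lik + log_prior
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    resp = np.exp(logits)
    return resp / resp.sum(axis=1, keepdims=True)


def balance(resp, n_iters=200):
    """Assumed balancing step (Sinkhorn-style), not taken from the paper.

    Alternately rescales columns and rows so that every document keeps a
    total weight of 1 while every expert receives roughly n_docs / n_experts.
    """
    n_docs, n_experts = resp.shape
    q = resp + 1e-12  # keep all entries strictly positive
    for _ in range(n_iters):
        q = q * (n_docs / n_experts) / q.sum(axis=0, keepdims=True)
        q = q / q.sum(axis=1, keepdims=True)
    return q


def m_step_prior(resp):
    """Update the mixing weights from the (balanced) responsibilities."""
    return resp.mean(axis=0)


# Toy data: 6 documents, 3 experts, with expert 0 artificially favoured.
rng = np.random.default_rng(0)
log_lik = rng.normal(size=(6, 3))
log_lik[:, 0] += 2.0
log_prior = np.log(np.full(3, 1.0 / 3.0))

resp = e_step(log_lik, log_prior)
balanced = balance(resp)
new_prior = m_step_prior(balanced)
```

Without the balancing step, the favoured expert would absorb most documents in this toy setting. After balancing, each expert receives a comparable share, which is the kind of collapse a fair assignment module is presumably meant to prevent.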