The MultiCoNER \RNum{2} shared task aims to tackle multilingual named entity recognition (NER) in fine-grained and noisy scenarios, and it inherits the semantic ambiguity and low-context setting of the MultiCoNER \RNum{1} task. To cope with these problems, the previous top systems in MultiCoNER \RNum{1} incorporate either knowledge bases or gazetteers. However, they still suffer from insufficient knowledge, limited context length, and a single retrieval strategy. In this paper, our team \textbf{DAMO-NLP} proposes a unified retrieval-augmented system (U-RaNER) for fine-grained multilingual NER. We perform error analysis on the previous top systems and reveal that their performance bottleneck lies in insufficient knowledge. We also find that the limited context length makes the retrieved knowledge invisible to the model. To enhance the retrieval context, we incorporate the entity-centric Wikidata knowledge base and use the infusion approach to broaden the contextual scope of the model. We further explore various search strategies and refine the quality of the retrieved knowledge. Our system\footnote{We will release the dataset, code, and scripts of our system at {\small \url{https://github.com/modelscope/AdaSeq/tree/master/examples/U-RaNER}}.} wins 9 out of 13 tracks in the MultiCoNER \RNum{2} shared task. Additionally, we compare our system with ChatGPT, one of the large language models that have unlocked strong capabilities on many tasks. The results show that there is still much room for improvement for ChatGPT on the extraction task.