Named Entity Recognition (NER) is a fundamental task in natural language understanding, with direct implications for web content analysis, search engines, and information retrieval systems. Fine-tuned NER models perform satisfactorily on standard NER benchmarks. However, because of limited fine-tuning data and a lack of external knowledge, they perform poorly at recognizing unseen entities, which compromises the usability and reliability of NER models in web-related applications. In contrast, Large Language Models (LLMs) like GPT-4 possess extensive external knowledge, but research indicates that they lack specialization for NER tasks. Furthermore, their non-public and large-scale weights make tuning LLMs difficult. To address these challenges, we propose LinkNER, a framework that combines small fine-tuned models with LLMs, together with an uncertainty-based linking strategy called RDC that enables fine-tuned models to complement black-box LLMs and achieve better performance. We experiment on both standard NER test sets and noisy social media datasets. LinkNER improves NER performance, notably surpassing SOTA models in robustness tests. We also quantitatively analyze the influence of key components, such as uncertainty estimation methods, LLMs, and in-context learning, on diverse NER tasks, and offer specific web-related recommendations.
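The idea of uncertainty-based linking can be illustrated with a minimal sketch. The abstract does not specify how RDC is implemented, so everything below is an assumption made for illustration: the entropy-based uncertainty score, the fixed `threshold`, and the hypothetical names `entropy`, `link_predictions`, and `llm_classify` (a caller-supplied function wrapping a black-box LLM). The sketch keeps the fine-tuned model's label for confident spans and defers uncertain spans to the LLM.

```python
import math
from typing import Callable, Dict, List, Tuple

Span = Tuple[int, int]  # (start, end) token indices, end exclusive


def entropy(probs: Dict[str, float]) -> float:
    """Shannon entropy (in nats) of a label distribution."""
    return -sum(p * math.log(p) for p in probs.values() if p > 0.0)


def link_predictions(
    spans: List[Tuple[Span, Dict[str, float]]],
    llm_classify: Callable[[Span, str], str],
    threshold: float,
) -> List[Tuple[Span, str, str]]:
    """Route each span to the local model or to the LLM.

    spans: candidate spans with the fine-tuned model's label probabilities.
    llm_classify: hypothetical wrapper around a black-box LLM; it receives
        the span and the local model's label and returns a final label.
    threshold: assumed uncertainty cut-off; spans whose entropy exceeds it
        are deferred to the LLM.
    Returns (span, label, source) triples, where source is "local" or "llm".
    """
    results = []
    for span, probs in spans:
        # Most probable label according to the fine-tuned model.
        local_label = max(probs, key=probs.get)
        if entropy(probs) > threshold:
            # Uncertain prediction: ask the LLM to decide.
            results.append((span, llm_classify(span, local_label), "llm"))
        else:
            # Confident prediction: keep the fine-tuned model's label.
            results.append((span, local_label, "local"))
    return results
```

In this sketch the LLM is queried only for the uncertain spans, so the number of LLM calls depends on the chosen threshold. The threshold value and the choice of entropy as the uncertainty measure are placeholders, not values or methods taken from the paper.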