Although large language models (LLMs) have achieved state-of-the-art (SOTA) performance on a variety of NLP tasks, their performance on named entity recognition (NER) remains significantly below supervised baselines. This is due to a gap between NER and LLMs: the former is by nature a sequence labeling task, while the latter are text-generation models. In this paper, we propose GPT-NER to resolve this issue. GPT-NER bridges the gap by transforming the sequence labeling task into a generation task that LLMs can easily adapt to. For example, the task of finding location entities in the input text "Columbus is a city" is transformed into generating the text sequence "@@Columbus## is a city", where the special tokens @@ and ## mark the entity to extract. To effectively address the "hallucination" issue of LLMs, where LLMs have a strong inclination to over-confidently label NULL inputs as entities, we propose a self-verification strategy that prompts the LLM to ask itself whether an extracted entity belongs to the labeled entity tag. We conduct experiments on five widely adopted NER datasets, and GPT-NER achieves performance comparable to fully supervised baselines, which, to the best of our knowledge, is the first time this has been achieved. More importantly, we find that GPT-NER exhibits greater ability in low-resource and few-shot setups: when the amount of training data is extremely scarce, GPT-NER performs significantly better than supervised models. This demonstrates the capability of GPT-NER in real-world NER applications where the number of labeled examples is limited.
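The output format described in the abstract can be illustrated with a minimal Python sketch. It only shows how the @@ and ## markers wrap and recover entity spans, plus one possible shape for a self-verification question. The function names (`mark_entities`, `extract_entities`, `strip_markers`, `verification_prompt`) and the prompt wording are hypothetical and chosen for illustration; they are not taken from the GPT-NER implementation, and no LLM call is made.

```python
import re

# Special tokens from the abstract: "@@" opens an entity, "##" closes it.
START, END = "@@", "##"
_ENTITY = re.compile(re.escape(START) + r"(.+?)" + re.escape(END))


def mark_entities(text, spans):
    """Wrap each (start, end) character span of `text` in @@ ... ##.

    Assumes the spans do not overlap. This is the target sequence an LLM
    would be asked to generate for a given entity type.
    """
    out, prev = [], 0
    for start, end in sorted(spans):
        out.append(text[prev:start])
        out.append(START + text[start:end] + END)
        prev = end
    out.append(text[prev:])
    return "".join(out)


def extract_entities(marked):
    """Recover the entity strings from a marked-up generation."""
    return _ENTITY.findall(marked)


def strip_markers(marked):
    """Remove the markers to get back the plain input text."""
    return _ENTITY.sub(r"\1", marked)


def verification_prompt(sentence, entity, label):
    """Build a yes/no self-verification question (hypothetical wording)."""
    return (
        f'The input sentence: "{sentence}"\n'
        f'Is "{entity}" in the input sentence a {label} entity? '
        "Please answer with yes or no."
    )


marked = mark_entities("Columbus is a city", [(0, 8)])
print(marked)
print(extract_entities(marked))
print(verification_prompt("Columbus is a city", "Columbus", "location"))
```

In this sketch, stripping the markers from a generation and comparing the result with the input is a simple way to check that the model copied the sentence faithfully before its extracted entities are trusted.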