Large language models (LLMs) pre-trained on massive corpora have demonstrated impressive few-shot learning ability on many NLP tasks. A common practice is to recast the task into a text-to-text format so that generative LLMs of natural language (NL-LLMs) like GPT-3 can be prompted to solve it. However, it is nontrivial to perform information extraction (IE) tasks with NL-LLMs, since the output of an IE task is usually structured and is therefore hard to convert into plain text. In this paper, we propose to recast the structured output in the form of code instead of natural language and to use generative LLMs of code (Code-LLMs) such as Codex to perform IE tasks, in particular named entity recognition and relation extraction. In contrast to NL-LLMs, we show that Code-LLMs can be well aligned with these IE tasks by designing code-style prompts and formulating the tasks as code generation tasks. Experimental results on seven benchmarks show that our method consistently outperforms both fine-tuning moderate-size pre-trained models specially designed for IE tasks (e.g., UIE) and prompting NL-LLMs under few-shot settings. We further conduct a series of in-depth analyses to demonstrate the merits of leveraging Code-LLMs for IE tasks.
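To make the idea of a code-style prompt concrete, the following is a minimal sketch of how a named entity recognition instance might be rendered as a partially written Python function, and how a generated completion might be parsed back into structured entities. The function name `named_entity_recognition`, the `entity_list.append(...)` convention, and the helpers `render_example`, `build_prompt`, and `parse_completion` are illustrative assumptions, not details stated in the abstract, and no actual Code-LLM call is shown.

```python
import ast
import json


def render_example(text, entities=None):
    """Render one NER instance as Python source (hypothetical prompt format)."""
    lines = [
        "def named_entity_recognition(input_text):",
        '    """ extract named entities from the input_text . """',
        f"    input_text = {json.dumps(text)}",
        "    entity_list = []",
        "    # extracted named entities",
    ]
    # Labeled demonstrations include the answer lines; the query instance
    # omits them so that the model is left to complete the function body.
    for entity in entities or []:
        lines.append(f"    entity_list.append({json.dumps(entity)})")
    return "\n".join(lines)


def build_prompt(demos, query_text):
    """Concatenate few-shot demonstrations with the unfinished query function."""
    blocks = [render_example(text, entities) for text, entities in demos]
    blocks.append(render_example(query_text))
    return "\n\n".join(blocks) + "\n"


def parse_completion(completion):
    """Recover entity dicts from generated `entity_list.append({...})` lines."""
    prefix = "entity_list.append("
    entities = []
    for raw_line in completion.splitlines():
        line = raw_line.strip()
        if not (line.startswith(prefix) and line.endswith(")")):
            continue
        try:
            # literal_eval only accepts Python literals, so generated code
            # is never executed.
            value = ast.literal_eval(line[len(prefix):-1])
        except (ValueError, SyntaxError):
            continue  # skip malformed generations
        if isinstance(value, dict):
            entities.append(value)
    return entities
```

In this sketch, the string returned by `build_prompt` would be sent to a code model, and the text it generates would be passed to `parse_completion`. Because the target is a sequence of ordinary append statements, the structured output maps directly onto a format the model has plausibly seen during pre-training on code, which is the alignment the abstract argues for.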