Information extraction (IE) aims to extract structured knowledge (such as entities, relations, and events) from plain natural language text. Recently, generative Large Language Models (LLMs) have demonstrated remarkable capabilities in text understanding and generation, allowing them to generalize across domains and tasks. As a result, numerous works have been proposed to harness the abilities of LLMs and offer viable solutions for IE tasks based on a generative paradigm. To provide a comprehensive and systematic review of LLM efforts for IE tasks, in this study we survey the most recent advancements in this field. We first present an extensive overview by categorizing these works in terms of IE subtasks and learning paradigms; we then empirically analyze the most advanced methods and identify emerging trends in IE with LLMs. Based on this review, we identify several technical insights and promising research directions that deserve further exploration in future studies. We maintain a public repository and consistently update related resources at: \url{https://github.com/quqxui/Awesome-LLM4IE-Papers}.