Existing research on large language models (LLMs) shows that they can solve information extraction tasks through multi-step planning. However, their extraction behavior on complex sentences and tasks is unstable, giving rise to issues such as false positives and missing elements. We observe that decomposing a complex extraction task and performing the extraction step by step can effectively improve LLMs' performance, and that the order in which entities are extracted significantly affects the final results. This paper proposes a two-stage multi-step method for LLM-based information extraction and adopts a reinforcement learning (RL) framework to execute the multi-step planning. We regard sequential extraction as a Markov decision process, build an LLM-based extraction environment, design a decision module that adaptively provides the optimal order for sequential entity extraction on different sentences, and use the DDQN algorithm to train the decision model. We also design rewards and evaluation metrics suited to the extraction results of LLMs. Extensive experiments on multiple public datasets demonstrate the effectiveness of our method in improving the information extraction capabilities of LLMs.
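To make the training signal concrete, the following is a minimal sketch of the bootstrap target for a single transition, assuming that DDQN here denotes Double DQN. The function name `ddqn_target`, the encoding of actions as entity-type indices, the restriction of the next action to types not yet extracted, and all numeric values are illustrative assumptions, not details taken from the paper.

```python
from typing import Sequence


def ddqn_target(
    reward: float,
    next_q_online: Sequence[float],
    next_q_target: Sequence[float],
    remaining: Sequence[int],
    gamma: float = 0.9,
) -> float:
    """Double DQN target for one transition (illustrative sketch).

    Actions are indices of entity types (a hypothetical encoding);
    `remaining` lists the types not yet extracted in the next state.
    """
    if not remaining:  # every type has been extracted: the episode is over
        return reward
    # The online network selects the next action among the remaining types...
    best = max(remaining, key=lambda a: next_q_online[a])
    # ...and the target network evaluates it (the Double DQN decoupling).
    return reward + gamma * next_q_target[best]


# Toy values: three entity types, with type 0 already extracted.
target = ddqn_target(
    reward=0.5,
    next_q_online=[9.0, 0.2, 0.7],
    next_q_target=[9.0, 0.4, 0.1],
    remaining=[1, 2],
    gamma=0.9,
)
```

In this toy case the online network prefers type 2 among the remaining types, so the target bootstraps from the target network's estimate for type 2 rather than from its own maximum. Separating action selection from action evaluation in this way is what distinguishes Double DQN from standard DQN.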