Large language models (LLMs) have demonstrated impressive abilities in generating unstructured natural language according to instructions. However, their performance can be inconsistent when tasked with producing text that adheres to specific structured formats, which is crucial in applications like named entity recognition (NER) or relation extraction (RE). To address this issue, this paper introduces an efficient method, G&O, to enhance their structured text generation capabilities. It breaks the generation into a two-step pipeline: initially, LLMs generate answers in natural language as intermediate responses. Subsequently, LLMs are asked to organize the output into the desired structure, using the intermediate responses as context. G&O effectively separates the generation of content from the structuring process, reducing the burden of completing two orthogonal tasks simultaneously. When tested on zero-shot NER and RE, G&O yields a significant improvement in LLM performance with minimal additional effort. This straightforward and adaptable prompting technique can also be combined with other strategies, like self-consistency, to further elevate LLM capabilities in various structured text generation tasks.
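The two-step pipeline described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the `llm` function is a hypothetical stand-in for any chat-completion API call (stubbed here so the example runs end to end), and the prompt wording is illustrative.

```python
import json

def llm(prompt: str) -> str:
    # Hypothetical stand-in for a real chat-completion API call.
    # It simulates a free-form answer for the Generate step and a
    # structured reply for the Organize step, so the pipeline runs.
    if "Organize" in prompt:
        return json.dumps({"PERSON": ["Marie Curie"], "LOCATION": ["Paris"]})
    return "The text mentions the person Marie Curie and the location Paris."

def generate_then_organize(text: str) -> dict:
    # Step 1 (Generate): ask for the answer in natural language first,
    # so the model focuses only on content.
    intermediate = llm(f"List the named entities in the following text:\n{text}")
    # Step 2 (Organize): reuse the intermediate answer as context and
    # ask only for restructuring into the target format.
    structured = llm(
        "Organize the entities below into JSON with entity-type keys.\n"
        f"Text: {text}\nEntities: {intermediate}"
    )
    return json.loads(structured)

result = generate_then_organize("Marie Curie worked in Paris.")
print(result)  # {'PERSON': ['Marie Curie'], 'LOCATION': ['Paris']}
```

Separating the two prompts is the core design choice: the first call carries no formatting constraints, and the second call is a near-mechanical restructuring task conditioned on the intermediate answer.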