Large language models (LLMs) have demonstrated strong capabilities in solving a wide range of natural language tasks across domains. However, due to their training objective and pre-training data, LLMs are not well equipped for tasks involving structured data generation. We propose a framework, Prompting with Iterative Verification (PiVe), to improve the graph-based generative capability of LLMs. We show how a small language model can be trained to act as a verifier module for the output of an LLM (e.g., ChatGPT, GPT-4) and to iteratively improve that output via fine-grained corrective instructions. We also show how the verifier module can apply iterative corrections offline, providing a more cost-effective solution to the text-to-graph generation task. Experiments on three graph-based datasets show consistent improvements gained via PiVe. Additionally, we create GenWiki-HIQ and show that the verifier module can serve as a data augmentation tool to improve the quality of automatically generated parallel text-graph datasets.