Despite their outstanding capabilities, large language models (LLMs) are prone to hallucination, producing factually incorrect content. This challenge has spurred efforts in attributed text generation, which prompts LLMs to generate content with supporting evidence. In this paper, we propose a novel framework, called Think&Cite, and formulate attributed text generation as a multi-step reasoning problem integrated with search. Specifically, we propose Self-Guided Monte Carlo Tree Search (SG-MCTS), which capitalizes on the self-reflection capability of LLMs to reflect on the intermediate states of MCTS and thereby guide the tree expansion process. To provide reliable and comprehensive feedback, we introduce Progress Reward Models, which measure the progress of the tree search from the root to the current state along two dimensions: generation progress and attribution progress. We conduct extensive experiments on three datasets, and the results show that our approach significantly outperforms baseline approaches.
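To make the abstract's ingredients concrete, the following is a minimal, self-contained toy sketch of the kind of search loop described above. It is not the paper's implementation: the token vocabulary, the reference sequence `TARGET`, the `reflect_ok` pruning rule (a stand-in for LLM self-reflection), and the split of the progress reward into generation and attribution components are all simplified assumptions for illustration.

```python
import math
import random

# Toy reference continuation the search should reconstruct; the final
# token stands in for a citation marker (hypothetical example data).
TARGET = ["the", "sky", "is", "blue", "[cite:1]"]
VOCAB = TARGET + ["green", "maybe"]  # includes distractor tokens


class Node:
    """One MCTS node: a partial token sequence plus visit statistics."""

    def __init__(self, state, parent=None):
        self.state = state
        self.parent = parent
        self.children = []
        self.visits = 0
        self.total = 0.0


def generation_progress(state):
    # Fraction of the target length already generated.
    return len(state) / len(TARGET)


def attribution_progress(state):
    # Fraction of positions whose token matches the reference (a crude
    # stand-in for how well the text is supported by evidence).
    return sum(a == b for a, b in zip(state, TARGET)) / len(TARGET)


def progress_reward(state):
    # Progress reward = average of the two progress components.
    return 0.5 * generation_progress(state) + 0.5 * attribution_progress(state)


def reflect_ok(state, token):
    # Stand-in for LLM self-reflection: keep an expansion only if it
    # improves attribution progress over the parent state.
    return attribution_progress(state + [token]) > attribution_progress(state)


def uct(child, parent_visits, c=1.4):
    if child.visits == 0:
        return float("inf")
    exploit = child.total / child.visits
    explore = c * math.sqrt(math.log(parent_visits) / child.visits)
    return exploit + explore


def search(iterations=50, seed=0):
    rng = random.Random(seed)
    root = Node([])
    for _ in range(iterations):
        node = root
        # Selection: descend by UCT while children exist.
        while node.children:
            node = max(node.children, key=lambda ch: uct(ch, node.visits))
        # Expansion, gated by the reflection check.
        if len(node.state) < len(TARGET):
            for tok in VOCAB:
                if reflect_ok(node.state, tok):
                    node.children.append(Node(node.state + [tok], node))
            if node.children:
                node = rng.choice(node.children)
        # Evaluation with the progress reward, then backpropagation.
        reward = progress_reward(node.state)
        while node is not None:
            node.visits += 1
            node.total += reward
            node = node.parent
    # Read out the best sequence by greedy most-visited descent.
    node = root
    while node.children:
        node = max(node.children, key=lambda ch: ch.visits)
    return node.state
```

On this toy problem, `search()` recovers `TARGET`, since the reflection check prunes distractor tokens and the progress reward steers backpropagation toward well-attributed completions; a real system would replace both with LLM-based reflection and learned reward models.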