Equipped with Chain-of-Thought (CoT) prompting, large language models (LLMs) have shown impressive reasoning ability on a variety of downstream tasks. However, because they hallucinate and cannot access external knowledge, LLMs often produce incorrect or unfaithful intermediate reasoning steps, especially on knowledge-intensive tasks such as knowledge base question answering (KBQA). To alleviate this issue, we propose Knowledge-Driven Chain-of-Thought (KD-CoT), a framework that verifies and modifies the reasoning traces in CoT through interaction with external knowledge, thereby overcoming hallucinations and error propagation. Concretely, we formulate the CoT rationale process of LLMs as a structured multi-round QA format. In each round, LLMs interact with a QA system that retrieves external knowledge, and then produce faithful reasoning traces based on the precise answers retrieved. The structured CoT reasoning of LLMs is facilitated by our KBQA CoT collection, which serves as in-context learning (ICL) demonstrations and can also be used as feedback augmentation to train a robust retriever. Extensive experiments on the WebQSP and ComplexWebQuestions datasets demonstrate the effectiveness of KD-CoT in task-solving reasoning generation: it outperforms vanilla CoT ICL by 8.0% and 5.1% in absolute success rate, respectively. Furthermore, our feedback-augmented retriever outperforms state-of-the-art baselines for knowledge retrieval, achieving a significant improvement in Hit performance.
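The multi-round QA format described in the abstract can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration, not the authors' implementation: the `propose` and `external_qa` callables stand in for the LLM and the retrieval-backed QA system, the `Round` record is a hypothetical container, and taking the last verified answer as the final answer is a simplification. The toy knowledge base and the scripted "LLM" exist only to make the loop runnable.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class Round:
    """One round of the structured CoT (hypothetical record type)."""
    sub_question: str
    llm_answer: str       # the model's own tentative answer (may be hallucinated)
    verified_answer: str  # the answer kept after consulting the external QA system


# Hypothetical interfaces, not a real library API:
#   propose(question, history) -> (sub_question, tentative_answer), or None to stop
#   external_qa(sub_question)  -> retrieved answer, or None if nothing is found
ProposeFn = Callable[[str, List[Round]], Optional[Tuple[str, str]]]
AnswerFn = Callable[[str], Optional[str]]


def kd_cot(question: str, propose: ProposeFn, external_qa: AnswerFn,
           max_rounds: int = 5) -> Tuple[Optional[str], List[Round]]:
    """Run a multi-round loop in which each LLM step is checked externally."""
    history: List[Round] = []
    for _ in range(max_rounds):
        step = propose(question, history)
        if step is None:  # the model signals that reasoning is complete
            break
        sub_q, tentative = step
        retrieved = external_qa(sub_q)
        # Prefer the retrieved answer; fall back to the model's own answer.
        verified = retrieved if retrieved is not None else tentative
        history.append(Round(sub_q, tentative, verified))
    # Assumption: the last verified answer is treated as the final answer.
    final = history[-1].verified_answer if history else None
    return final, history


# Toy stand-ins so the sketch runs end to end.
TOY_KB: Dict[str, str] = {
    "Who wrote Hamlet?": "William Shakespeare",
    "Where was William Shakespeare born?": "Stratford-upon-Avon",
}


def toy_external_qa(sub_question: str) -> Optional[str]:
    return TOY_KB.get(sub_question)


def toy_propose(question: str, history: List[Round]) -> Optional[Tuple[str, str]]:
    # A scripted "LLM" whose tentative answers are deliberately wrong.
    if len(history) == 0:
        return ("Who wrote Hamlet?", "Christopher Marlowe")
    if len(history) == 1:
        author = history[0].verified_answer  # builds on the corrected answer
        return (f"Where was {author} born?", "London")
    return None


final_answer, trace = kd_cot(
    "Where was the author of Hamlet born?", toy_propose, toy_external_qa
)
for r in trace:
    print(f"Q: {r.sub_question} | LLM: {r.llm_answer} | verified: {r.verified_answer}")
print("Final:", final_answer)
```

In this toy run the first tentative answer is wrong, and the second sub-question is built from the corrected answer rather than the hallucinated one. This is the sense in which per-round verification limits error propagation.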