Two lines of approaches have been adopted for complex reasoning with LLMs. One line of work prompts LLMs with various reasoning structures, whose structured outputs can naturally be regarded as intermediate reasoning steps. Another line of work adopts LLM-free declarative solvers to perform the reasoning, achieving higher answer accuracy but lacking interpretability due to the black-box nature of the solvers. Aiming to resolve this trade-off between answer accuracy and interpretability, we present a simple extension to the latter line of work. Specifically, we show that the intermediate search logs generated by Prolog interpreters can be accessed and interpreted into human-readable reasoning proofs. As long as LLMs correctly translate problem descriptions into Prolog representations, the corresponding reasoning proofs are guaranteed to be causal and reliable. On two logical reasoning datasets and one arithmetic reasoning dataset, our framework obtains significant improvements in both answer accuracy and reasoning proof accuracy. Our code is released at https://github.com/DAMO-NLP-SG/CaRing
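To make the core idea concrete, the following is a minimal, hypothetical Python sketch (not the paper's implementation, which relies on an actual Prolog interpreter): a toy backward-chaining solver whose search log doubles as a human-readable proof. The facts and rule mimic a Prolog program an LLM might emit for "Socrates is a man; all men are mortal."

```python
# Hypothetical sketch: a toy backward chainer whose search log is a proof.
# Not the paper's code; it only illustrates how solver traces can be
# read back as reasoning steps.

facts = {("man", "socrates")}
rules = [
    # (head, body) encoding the Prolog rule:  mortal(X) :- man(X).
    (("mortal", "X"), [("man", "X")]),
]

def prove(goal, log):
    """Try to prove `goal`; record each satisfied step in `log`."""
    if goal in facts:
        log.append(f"fact: {goal[0]}({goal[1]})")
        return True
    for (head_pred, _), body in rules:
        if head_pred == goal[0]:
            binding = goal[1]  # bind the rule variable to the query constant
            if all(prove((body_pred, binding), log) for body_pred, _ in body):
                log.append(
                    f"rule: {head_pred}({binding}) :- "
                    + ", ".join(f"{b}({binding})" for b, _ in body)
                )
                return True
    return False

log = []
assert prove(("mortal", "socrates"), log)
print("\n".join(log))
# The accumulated log reads as a causal proof:
#   fact: man(socrates)
#   rule: mortal(socrates) :- man(socrates)
```

A real Prolog interpreter exposes an analogous trace of goal resolutions, which is what the framework interprets into proofs.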