Deductive Verification of Chain-of-Thought Reasoning

Large Language Models (LLMs) benefit significantly from Chain-of-Thought (CoT) prompting on a variety of reasoning tasks. While CoT lets models produce more comprehensive reasoning processes, its emphasis on intermediate reasoning steps can inadvertently introduce hallucinations and accumulated errors, limiting models' ability to solve complex reasoning tasks. Inspired by how humans engage in careful, meticulous deductive reasoning to solve tasks, we seek to enable language models to perform explicit and rigorous deductive reasoning, and to ensure the trustworthiness of their reasoning process through self-verification. However, directly verifying the validity of an entire deductive reasoning process is challenging, even for advanced models like ChatGPT. In light of this, we propose to decompose the verification of a reasoning process into a series of step-by-step subprocesses, each receiving only its necessary context and premises. To facilitate this procedure, we propose Natural Program, a natural language-based deductive reasoning format. Our approach enables models to generate precise reasoning steps in which subsequent steps are more rigorously grounded in prior steps. It also empowers language models to carry out reasoning self-verification in a step-by-step manner. By integrating this verification process into each deductive reasoning stage, we significantly enhance the rigor and trustworthiness of generated reasoning steps. In the process, we also improve answer correctness on complex reasoning tasks. Code will be released at https://github.com/lz1oceani/verify_cot.
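
The step-by-step decomposition described above can be illustrated with a minimal Python sketch. It is not the released code: the `Step` record, the `verify_chain` function, and the `toy_sum_verifier` are hypothetical names introduced here, and the toy verifier is only a stand-in for a call to a language model. The sketch assumes each reasoning step explicitly lists the labels of the statements it relies on, so that the verifier sees only those statements and never the full chain.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Step:
    index: int           # label of this step (hypothetical numbering scheme)
    premises: List[int]  # labels of the statements this step cites
    claim: str           # the statement this step derives


# A verifier gets only the cited premise texts plus the claim.
# In practice this would wrap an LLM call; here it is an assumption.
Verifier = Callable[[List[str], str], bool]


def verify_chain(
    context: Dict[int, str], steps: List[Step], verifier: Verifier
) -> Tuple[bool, List[int]]:
    """Check each step in isolation; return (all_valid, failed_step_labels)."""
    known = dict(context)  # labelled statements available so far
    failed: List[int] = []
    for step in steps:
        if any(p not in known for p in step.premises):
            # A step that cites an unknown label cannot be grounded.
            failed.append(step.index)
        else:
            minimal_context = [known[p] for p in step.premises]
            if not verifier(minimal_context, step.claim):
                failed.append(step.index)
        # Later steps may still cite this claim; the chain as a whole
        # is already marked invalid if this step failed.
        known[step.index] = step.claim
    return (not failed, failed)


def toy_sum_verifier(premises: List[str], claim: str) -> bool:
    """Toy rule: a claim 'name = v' holds iff v is the sum of the premise values."""
    total = sum(int(p.split("=")[1]) for p in premises)
    return int(claim.split("=")[1]) == total


context = {1: "apples = 3", 2: "pears = 4", 3: "plums = 10"}
steps = [
    Step(index=4, premises=[1, 2], claim="fruit = 7"),   # 3 + 4 = 7, valid
    Step(index=5, premises=[4, 3], claim="total = 18"),  # 7 + 10 = 17, invalid
]
result = verify_chain(context, steps, toy_sum_verifier)
print(result)
```

Because every step is checked against only the premises it cites, a single faulty deduction is localized to its own label instead of being hidden inside one pass over the whole chain.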
