Although large language models (LLMs) have achieved excellent performance in a variety of evaluation benchmarks, they still struggle in complex reasoning tasks which require specific knowledge and multi-hop reasoning. To improve the reasoning abilities, we propose ChatCoT, a tool-augmented chain-of-thought reasoning framework for chat-based LLMs (e.g., ChatGPT). In ChatCoT, we model the chain-of-thought (CoT) reasoning as multi-turn conversations, to utilize tools in a more natural way through chatting. At each turn, LLMs can either interact with tools or perform the reasoning. Our approach can effectively leverage the multi-turn conversation ability of chat-based LLMs, and integrate the thought chain following and tools manipulation in a unified way. Specially, we initialize the early turns of the conversation by the knowledge about tools, tasks, and reasoning format, and propose an iterative tool-augmented reasoning step to perform step-by-step tool-augmented reasoning. The experiment results on two complex reasoning datasets (MATH and HotpotQA) have shown the effectiveness of ChatCoT on complex reasoning tasks, achieving a 7.9% relative improvement over the state-of-the-art baseline. Our code and data are available at: \url{https://github.com/RUCAIBOX/ChatCoT}.
翻译:尽管大规模语言模型(LLMs)在多种评估基准中取得了优异性能,但在需要特定知识与多跳推理的复杂推理任务中仍存在不足。为提升推理能力,我们提出ChatCoT——一种面向聊天式LLMs(如ChatGPT)的工具增强思维链推理框架。在ChatCoT中,我们将思维链(CoT)推理建模为多轮对话,通过聊天方式更自然地使用工具。每轮对话中,LLMs可与工具交互或执行推理。该方法能有效利用聊天式LLMs的多轮对话能力,将思维链遵循与工具操作统一整合。具体而言,我们通过工具、任务及推理格式的相关知识初始化对话早期轮次,并提出迭代式工具增强推理步骤,实现逐步的工具辅助推理。在两个复杂推理数据集(MATH与HotpotQA)上的实验结果表明,ChatCoT在复杂推理任务中具有有效性,相较于最先进基线实现了7.9%的相对性能提升。我们的代码与数据已开源至:\url{https://github.com/RUCAIBOX/ChatCoT}。