Large language models (LLMs) have achieved impressive performance on various reasoning tasks. To further improve their performance, we propose MultiTool-CoT, a novel framework that uses chain-of-thought (CoT) prompting to incorporate multiple external tools, such as a calculator and a knowledge retriever, into the reasoning process. We apply MultiTool-CoT to the Task 2 dataset of NumGLUE, which requires both numerical reasoning and domain-specific knowledge. Experiments show that our method significantly outperforms strong baselines and achieves state-of-the-art performance.
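To make the idea of calling several tools during a reasoning chain concrete, the following is a minimal sketch, not the implementation behind MultiTool-CoT. It assumes that the model marks a tool call by ending a reasoning step with a trigger such as `<<Calculator>> 3 * 18`, and that the tool's output is appended to the transcript before generation resumes. The trigger syntax, the tool names, the toy fact table, and the `scripted_model` stand-in for an LLM are all hypothetical choices made for illustration.

```python
import ast
import operator
import re
from typing import Callable, Dict, List

# Arithmetic operators the toy calculator accepts.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def calculator(expression: str) -> str:
    """Evaluate a basic arithmetic expression without using eval()."""

    def ev(node: ast.AST):
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](ev(node.left), ev(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -ev(node.operand)
        raise ValueError(f"unsupported expression: {expression!r}")

    return str(ev(ast.parse(expression, mode="eval").body))


# Hypothetical stand-in for a knowledge retriever: a fixed lookup table.
_FACTS = {"molar mass of H2O": "18 g/mol"}


def knowledge_retriever(query: str) -> str:
    return _FACTS.get(query.strip(), "unknown")


TOOLS: Dict[str, Callable[[str], str]] = {
    "Calculator": calculator,
    "Retriever": knowledge_retriever,
}

# Assumed trigger format: "<<ToolName>> argument" at the end of a step.
TRIGGER = re.compile(r"<<(\w+)>>\s*(.*)$")


def run_with_tools(generate: Callable[[str], str], question: str,
                   max_steps: int = 8) -> str:
    """Alternate between model steps and tool calls until no tool is requested."""
    transcript = question
    for _ in range(max_steps):
        step = generate(transcript)  # one reasoning step from the model
        transcript += "\n" + step
        match = TRIGGER.search(step)
        if match is None:
            break  # no trigger, so treat this step as the final answer
        name, argument = match.groups()
        tool = TOOLS.get(name)
        result = tool(argument) if tool else f"unknown tool {name}"
        transcript += f"\n[{name} output] {result}"
    return transcript


def scripted_model(steps: List[str]) -> Callable[[str], str]:
    """Stand-in for an LLM: replays fixed steps and ignores the transcript."""
    remaining = iter(steps)
    return lambda transcript: next(remaining)


if __name__ == "__main__":
    model = scripted_model([
        "<<Retriever>> molar mass of H2O",
        "<<Calculator>> 3 * 18",
        "Answer: 54 g",
    ])
    print(run_with_tools(model, "What is the mass of 3 moles of H2O?"))
```

In a real setting, `generate` would wrap an LLM call whose few-shot prompt demonstrates the trigger format, and generation would be stopped as soon as a trigger is emitted so that the tool output can be inserted before the next step.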