Numerical reasoning is vital for natural language processing models to understand and process numerical information in real-world scenarios. Most current methods first generate an Intermediate Meaning Representation (IMR) of the question and then derive the answer from it. Current state-of-the-art (SOTA) methods use large language models (LLMs) to generate programs as IMRs. Intuitively, equations impose fewer restrictions and are semantically closer to the question than programs, which should lead to higher generation accuracy. However, current LLMs generate equations less accurately than programs; we hypothesize that this is because equations are rarer than programs in pre-training data. In this paper, we therefore use equations as IMRs for the numerical reasoning task by addressing two problems: (1) theoretically, how to prove that equations are IMRs with higher generation accuracy than programs; and (2) empirically, how to improve the accuracy with which LLMs generate equations. For the first problem, we propose and prove a proposition for theoretically comparing the generation accuracy of different IMRs. For the second problem, we present Boosting Numerical Reasoning by Decomposing the Generation of Equations (Bridge), a method that improves the accuracy of LLMs in generating equations as IMRs by reducing their tendency to generate constant expressions and programs. Bridge improves performance by 2.2%, 0.9%, and 1.7% on the GSM8K, SVAMP, and Algebra datasets, respectively, compared to the previous state-of-the-art methods under the single reasoning path setting. Our code and prompts are released at https://github.com/zirui-HIT/Bridge_for_Numerical_Reasoning.
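To make the contrast between the two kinds of IMR concrete, the following minimal sketch encodes one hypothetical word problem both as a program and as an equation, then derives the answer from each. The question, the names `run_program` and `solve_linear`, and the two-point linear solver are illustrative assumptions only; they are not the Bridge method or its prompts.

```python
from fractions import Fraction

# Hypothetical word problem, used only for illustration.
QUESTION = ("Tom has 3 times as many apples as Ann. "
            "Together they have 24 apples. How many does Ann have?")

# Program IMR: the model must already rearrange the relation
# into an explicit sequence of computation steps.
PROGRAM_IMR = "total = 24\nratio = 3\nanswer = total / (ratio + 1)"

# Equation IMR: states the relation from the question directly,
# leaving the rearrangement to a solver.
EQUATION_IMR = "x + 3 * x = 24"


def run_program(program: str) -> Fraction:
    """Execute a program IMR and return the value bound to `answer`."""
    scope = {}
    exec(program, {"__builtins__": {}}, scope)
    return Fraction(scope["answer"])


def solve_linear(equation: str, var: str = "x") -> Fraction:
    """Solve an equation that is linear in one unknown (assumed here).

    The residual lhs - rhs has the form a*var + b, so evaluating it
    at 0 and 1 recovers a and b, and the root is -b / a.
    """
    lhs, rhs = (side.strip() for side in equation.split("="))

    def residual(value: int) -> Fraction:
        env = {var: Fraction(value)}
        left = eval(lhs, {"__builtins__": {}}, env)
        right = eval(rhs, {"__builtins__": {}}, env)
        return Fraction(left - right)

    b = residual(0)
    a = residual(1) - b
    if a == 0:
        raise ValueError("equation does not constrain the unknown")
    return -b / a


if __name__ == "__main__":
    print(run_program(PROGRAM_IMR))    # answer derived from the program IMR
    print(solve_linear(EQUATION_IMR))  # answer derived from the equation IMR
```

In this toy case both representations yield the same answer; the difference lies in how much rearrangement the generator has to perform before an external executor or solver takes over. An output such as `x = 24 / 4` would be a constant expression in the sense used above, since the model has already done the solving itself.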