Following the success of ChatGPT, research on large language models (LLMs) has grown rapidly. However, these models have several limitations, such as toxicity and poor performance on arithmetic problems. Meanwhile, LLMs may have latent abilities that have yet to be exploited. In this paper, we take a different approach to enhancing the arithmetic ability of LLMs. We propose training an LLM to generate a postfix expression for the arithmetic problem and combining it with a small pretrained model. This small model converts the token embeddings into real dense numbers and invokes native functions of a deep learning platform to compute the correct answer. To generate the final result, we propose prompt injection, which feeds the result output by the small model back into the LLM. This work offers a different way of thinking about, training, and using a language model. The code and models will be released at \url{https://github.com/eteced/arithmetic_finetuning_v1}.
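The pipeline described in the abstract (postfix expression, exact evaluation outside the LLM, result injected back) can be illustrated with a minimal sketch. This is not the released implementation: the paper's small model works on token embeddings and calls native functions of a deep learning platform, whereas the sketch below substitutes a plain stack-based evaluator over whitespace-separated text tokens. The names `OPS`, `eval_postfix`, and `inject_result`, the supported operator set, and the injected text format are all hypothetical.

```python
import operator

# Hypothetical operator table; the operators supported by the paper's
# small model are not stated in the abstract.
OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}


def eval_postfix(expr: str) -> float:
    """Evaluate a whitespace-separated postfix expression with a stack.

    Stand-in for the small model: here the tokens are plain text,
    not token embeddings.
    """
    stack = []
    for tok in expr.split():
        if tok in OPS:
            if len(stack) < 2:
                raise ValueError(f"operator {tok!r} is missing operands")
            right = stack.pop()  # the top of the stack is the right operand
            left = stack.pop()
            stack.append(OPS[tok](left, right))
        else:
            stack.append(float(tok))  # operand token
    if len(stack) != 1:
        raise ValueError("malformed postfix expression")
    return stack[0]


def inject_result(prompt: str, result: float) -> str:
    """Append the computed value to the prompt (assumed injection format)."""
    return f"{prompt}\n[calculator result: {result:g}]"


# "(3 + 4) * 2" written in postfix form, as the LLM would be trained to emit.
value = eval_postfix("3 4 + 2 *")
print(inject_result("What is (3 + 4) * 2?", value))
```

Postfix notation needs no parentheses or precedence rules, so a single left-to-right pass with a stack is enough to evaluate it, which is one plausible reason to have the LLM emit this form rather than infix.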