Program-of-Thought (PoT), which uses a programming language rather than natural language as the intermediate step in reasoning, is an important way for LLMs to solve mathematical problems. Because different programming languages excel in different areas, it is natural to choose the most suitable language for each problem. However, current PoT research focuses only on single-language PoT and ignores the differences between programming languages. This paper therefore proposes a multilingual program reasoning method, MultiLingPoT. The method fine-tunes the model on multilingual data so that it can answer questions in multiple programming languages. In addition, prior and posterior hybrid methods help the model select the most suitable language for each problem. Our experimental results show that MultiLingPoT training improves mathematical reasoning in each programming language by about 2.5%. Moreover, with proper mixing, the performance of MultiLingPoT improves further, achieving a 6% increase over single-language PoT with data augmentation. Resources for this paper can be found at https://github.com/Nianqi-Li/MultiLingPoT.
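To make the idea of posterior language selection concrete, the sketch below shows one simple way it could work: run the programs generated in each language, then keep the answer that most languages agree on. This is an illustrative assumption, not the selection procedure of MultiLingPoT, which the abstract does not specify. The function name `select_answer_by_vote` and the language keys are hypothetical.

```python
from collections import Counter
from typing import Dict, Optional


def select_answer_by_vote(answers_by_language: Dict[str, Optional[str]]) -> Optional[str]:
    """Hypothetical posterior selection: return the answer most languages agree on.

    Assumption: each language's generated program has already been executed,
    and a failed execution is recorded as None.
    """
    # Discard languages whose program failed or produced no answer.
    valid = [answer for answer in answers_by_language.values() if answer is not None]
    if not valid:
        return None
    # most_common orders ties by first occurrence, so a tie goes to the
    # answer from the language listed earliest in the dictionary.
    return Counter(valid).most_common(1)[0][0]


# Illustrative inputs only; the language names and answers are placeholders.
answers = {"python": "12", "cpp": "12", "java": "15"}
print(select_answer_by_vote(answers))
```

A prior method would instead choose the language before any program is generated, for example from features of the question, so only one program needs to be produced and run.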