Once human programmers have mastered one programming language, learning a new one becomes easier. In this report, we explore whether programming languages can boost each other during the instruction fine-tuning phase of code large language models. We conduct extensive experiments on 8 popular programming languages (Python, JavaScript, TypeScript, C, C++, Java, Go, HTML) with StarCoder. The results demonstrate that programming languages can significantly improve each other. For example, CodeM-Python 15B, trained on Python, improves Java by an absolute 17.95% pass@1 on HumanEval-X. More surprisingly, we found that CodeM-HTML 7B, trained on the HTML corpus, improves Java by an absolute 15.24% pass@1. Our training data is released at https://github.com/NL2Code/CodeM.
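The abstract reports gains as absolute pass@1 differences but does not say how pass@1 is estimated. As a minimal sketch, the following assumes the unbiased pass@k estimator commonly used with HumanEval-style benchmarks, where n samples are drawn per problem and c of them pass the tests. The function names and the sample counts in the usage note are hypothetical and are not taken from the report.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for one problem.

    n: number of samples generated, c: number that pass the tests,
    k: budget being evaluated. Computes 1 - C(n - c, k) / C(n, k).
    """
    if not (0 <= c <= n) or not (1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        # Fewer than k failing samples: every size-k subset contains a pass.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def mean_pass_at_k(results: list[tuple[int, int]], k: int) -> float:
    """Average pass@k over problems; each entry is (n, c) for one problem."""
    if not results:
        raise ValueError("results must not be empty")
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)


def absolute_gain_points(before: float, after: float) -> float:
    """Absolute gain in percentage points between two scores in [0, 1]."""
    return (after - before) * 100.0
```

For k = 1 the estimator reduces to c / n per problem, so a benchmark score is the mean fraction of passing samples. An "absolute" gain is then the plain difference between two such scores expressed in percentage points, for example `absolute_gain_points(base_score, tuned_score)` for two hypothetical model scores, rather than a relative ratio.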