Once human programmers have mastered one programming language, learning a new one becomes easier. In this report, we explore whether programming languages can boost each other during the instruction fine-tuning phase of code large language models. We conduct extensive experiments on 8 popular programming languages (Python, JavaScript, TypeScript, C, C++, Java, Go, HTML) with StarCoder. The results demonstrate that programming languages can significantly improve each other. For example, CodeM-Python 15B, trained on Python, improves Java by an absolute 17.95% pass@1 on HumanEval-X. More surprisingly, we find that CodeM-HTML 7B, trained on the HTML corpus, improves Java by an absolute 15.24% pass@1. Our training data is released at https://github.com/NL2Code/CodeM.
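To make the reported metric concrete, the following is a minimal sketch of how a pass@1 score and an absolute gain between two models could be computed. It assumes the unbiased pass@k estimator commonly used with HumanEval-style benchmarks; the abstract does not state which estimator or sampling setup was used, so that choice, the function names (`pass_at_k`, `mean_pass_at_k`, `absolute_gain`), and all numbers in the usage note are illustrative assumptions, not the authors' evaluation code or results.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one problem from n sampled solutions, c of them correct.

    Uses 1 - C(n - c, k) / C(n, k), i.e. one minus the probability that
    k samples drawn without replacement are all incorrect.
    """
    if not (0 <= c <= n) or not (1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        # Fewer than k incorrect samples exist, so any k draws include a correct one.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def mean_pass_at_k(results: list[tuple[int, int]], k: int) -> float:
    """Average pass@k over problems; each entry is (n_samples, n_correct)."""
    if not results:
        raise ValueError("results must not be empty")
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)


def absolute_gain(baseline: float, tuned: float) -> float:
    """Absolute improvement in percentage points between two scores in [0, 1]."""
    return 100.0 * (tuned - baseline)
```

As a usage example with made-up inputs, `absolute_gain(0.25, 0.5)` yields 25.0 percentage points. An "absolute" improvement in the abstract's sense is this difference between the two pass@1 scores, not their ratio.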