Applications at the intersection of natural and programming languages, such as code generation and code summarization, have burgeoned recently, but they are usually English-centric. This creates a barrier for developers who are not proficient in English. To narrow this gap in technology development across languages, we propose a multilingual dataset, MCoNaLa, to benchmark code generation from natural language commands in languages beyond English. Following the methodology of the English Code/Natural Language Challenge (CoNaLa) dataset, we annotated a total of 896 NL-code pairs in three languages: Spanish, Japanese, and Russian. We present a quantitative evaluation of performance on the MCoNaLa dataset by testing state-of-the-art code generation systems. While the difficulty varies across these three languages, all systems lag significantly behind their English counterparts, revealing the challenges in adapting code generation to new languages.