GitHub Copilot is an AI-powered tool that automates program synthesis, and it has attracted significant attention since its launch in 2021. Recent studies have extensively examined Copilot's capabilities across various programming tasks, as well as its security issues. However, little is known about how the natural language used in a prompt affects its code suggestions. In NLP, unequal treatment of natural languages is regarded as a form of social bias, and such bias could affect diversity in software engineering. To address this gap, we conducted an empirical study of how three popular natural languages (English, Japanese, and Chinese) affect Copilot's performance. We evaluated Copilot on 756 questions of varying difficulty from AtCoder contests. The results show that Copilot's capability varies across natural languages, with Chinese yielding the worst performance. Furthermore, regardless of the natural language used, performance decreases significantly as question difficulty increases. Our work is an initial step toward understanding the role of natural language in Copilot's capability and opens promising opportunities for future work.