Code generation tools driven by artificial intelligence have recently become more popular as advances in deep learning and natural language processing have increased their capabilities. The proliferation of these tools may be a double-edged sword: they can increase developer productivity by making code easier to write, but research has shown that they can also generate insecure code. In this paper, we perform a user-centered evaluation of GitHub's Copilot to better understand its strengths and weaknesses with respect to code security. We conduct a user study in which participants solve programming problems that have potentially vulnerable solutions, both with and without Copilot assistance. The main goal of the user study is to determine how the use of Copilot affects participants' security performance. In our set of participants (n=25), we find that access to Copilot is associated with more secure solutions when tackling harder problems. For the easier problem, we observe no effect of Copilot access on the security of solutions. We also observe no disproportionate impact of Copilot use on particular kinds of vulnerabilities. Our results indicate that there are potential security benefits to using Copilot, but more research is warranted on the effects of code generation tools on technically complex problems with security requirements.