We conduct the first large-scale user study examining how users interact with an AI code assistant to solve a variety of security-related tasks across different programming languages. Overall, we find that participants who had access to an AI assistant based on OpenAI's codex-davinci-002 model wrote significantly less secure code than those without access. Additionally, participants with access to an AI assistant were more likely to believe they wrote secure code than those without access to the AI assistant. Furthermore, we find that participants who trusted the AI less and engaged more with the language and format of their prompts (e.g., re-phrasing, adjusting temperature) provided code with fewer security vulnerabilities. Finally, to better inform the design of future AI-based code assistants, we provide an in-depth analysis of participants' language and interaction behavior and release our user interface as an instrument for conducting similar studies in the future.