Large language models for code generation have recently achieved breakthroughs in several programming language tasks. Their progress on competition-level programming problems has made them an emerging pillar of AI-assisted pair programming. Tools such as GitHub Copilot are already part of the daily programming workflow and are used by more than a million developers. The training data for these models is usually collected from open-source repositories (e.g., GitHub) that contain software faults and security vulnerabilities. This unsanitized training data can lead language models to learn these vulnerabilities and propagate them during code generation. Given how widely developers use these models in their daily workflow, it is crucial to study their security systematically. In this work, we propose the first approach to automatically find security vulnerabilities in black-box code generation models. To achieve this, we introduce a novel black-box inversion approach based on few-shot prompting. We evaluate its effectiveness by examining whether code generation models produce high-risk security weaknesses. We show that our approach automatically and systematically finds thousands of security vulnerabilities in various code generation models, including the commercial black-box model GitHub Copilot.