Large Language Models (LLMs) have shown impressive proficiency in code generation. Nonetheless, like human developers, these models can generate code that contains security vulnerabilities and flaws. Writing secure code remains a substantial challenge, as vulnerabilities often arise during interactions between programs and external systems or services, such as databases and operating systems. In this paper, we propose a novel approach, Feedback-Driven Solution Synthesis (FDSS), in which an LLM receives feedback from Bandit, a static code analysis tool, and generates potential solutions to resolve the reported security vulnerabilities. Each solution, along with the vulnerable code, is then sent back to the LLM for code refinement. Our approach shows a significant improvement over the baseline and outperforms existing approaches. Furthermore, we introduce a new dataset, PythonSecurityEval, collected from real-world scenarios on Stack Overflow, to evaluate the ability of LLMs to generate secure code. Code and data are available at \url{https://github.com/Kamel773/LLM-code-refine}.
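The feedback loop described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the released implementation: the function names (`run_bandit`, `fdss_refine`), the prompt wording, the parameters `num_solutions` and `max_rounds`, and the rule of keeping the candidate with the fewest remaining findings are all assumptions made for the example. The `llm` argument is a hypothetical callable that maps a prompt string to a completion string, and `run_bandit` assumes the `bandit` command-line tool is installed.

```python
import json
import os
import subprocess
import tempfile
from typing import Callable, List

Analyzer = Callable[[str], List[str]]  # source code -> list of findings
LLM = Callable[[str], str]             # prompt -> completion (hypothetical interface)


def run_bandit(code: str) -> List[str]:
    """Run the Bandit CLI on a code string and return its findings as short strings."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as handle:
        handle.write(code)
        path = handle.name
    try:
        # Bandit exits non-zero when it reports issues, so check=True is not used.
        proc = subprocess.run(
            ["bandit", "-q", "-f", "json", path],
            capture_output=True,
            text=True,
        )
        report = json.loads(proc.stdout)
    finally:
        os.unlink(path)
    return [
        f"{r['test_id']} line {r['line_number']}: {r['issue_text']}"
        for r in report["results"]
    ]


def fdss_refine(
    code: str,
    llm: LLM,
    analyze: Analyzer = run_bandit,
    num_solutions: int = 3,  # assumed number of candidate solutions per round
    max_rounds: int = 2,     # assumed cap on feedback rounds
) -> str:
    """Refine `code` using analyzer feedback and LLM-proposed solutions."""
    for _ in range(max_rounds):
        findings = analyze(code)
        if not findings:
            return code  # nothing left to fix
        feedback = "\n".join(findings)

        # Step 1: ask the LLM for several candidate solutions to the findings.
        proposal = llm(
            f"Propose {num_solutions} distinct ways to fix these security issues, "
            f"one per line.\nIssues:\n{feedback}\nCode:\n{code}"
        )
        solutions = [s.strip() for s in proposal.splitlines() if s.strip()]

        # Step 2: send each solution back with the vulnerable code for refinement,
        # and keep the candidate with the fewest remaining findings (assumed rule).
        best, best_count = code, len(findings)
        for solution in solutions[:num_solutions]:
            candidate = llm(
                f"Rewrite the code to apply this fix and return only code.\n"
                f"Fix: {solution}\nIssues:\n{feedback}\nCode:\n{code}"
            )
            count = len(analyze(candidate))
            if count < best_count:
                best, best_count = candidate, count
        code = best
    return code
```

Passing the analyzer and the model as callables keeps the sketch independent of any particular LLM client and lets the loop be exercised with stand-ins when Bandit is not installed.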