Recent advancements in automatic code generation using large language models (LLMs) have brought us closer to fully automated secure software development. However, existing approaches often rely on a single agent for code generation, which struggles to produce secure, vulnerability-free code. Traditional program synthesis with LLMs has primarily focused on functional correctness, often neglecting critical dynamic security issues that arise at runtime. To address these challenges, we propose AutoSafeCoder, a multi-agent framework that leverages LLM-driven agents for code generation, vulnerability analysis, and security enhancement through continuous collaboration. The framework consists of three agents: a Coding Agent responsible for code generation, a Static Analyzer Agent that identifies vulnerabilities, and a Fuzzing Agent that performs dynamic testing using mutation-based fuzzing to detect runtime errors. Our contribution is to improve the safety of multi-agent code generation by integrating static and dynamic testing into an iterative, LLM-driven code-generation process. Experiments using the SecurityEval dataset demonstrate a 13% reduction in code vulnerabilities compared to baseline LLMs, with no compromise in functionality.
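The iterative collaboration described above can be sketched as a simple control loop: the Coding Agent produces a candidate, the Static Analyzer and Fuzzing Agents each return findings, and any findings are fed back for revision until both checks pass or an iteration budget is exhausted. This is a minimal conceptual sketch; all function names, the feedback format, and the toy mutation strategy are illustrative assumptions, not the paper's actual implementation or API.

```python
import random
from dataclasses import dataclass, field

@dataclass
class Report:
    """Findings from one analysis pass (empty list means the check passed)."""
    issues: list = field(default_factory=list)

    def clean(self) -> bool:
        return not self.issues

def coding_agent(task: str, feedback=None) -> str:
    # Stand-in for an LLM call that generates (or revises) code for `task`,
    # optionally conditioned on feedback from the analysis agents.
    return "def add(a, b):\n    return a + b\n"

def static_analyzer_agent(code: str) -> Report:
    # Stand-in for LLM-driven static vulnerability analysis of the source.
    return Report(issues=[])

def fuzzing_agent(code: str, seeds, mutations: int = 50) -> Report:
    # Toy mutation-based fuzzing: mutate seed inputs, execute the candidate,
    # and record any inputs that raise at runtime.
    namespace = {}
    exec(code, namespace)  # load the generated function for testing
    fn = namespace["add"]  # assumed entry point for this sketch
    report = Report()
    for seed in seeds:
        for _ in range(mutations):
            a = seed + random.randint(-100, 100)  # mutated inputs
            b = seed + random.randint(-100, 100)
            try:
                fn(a, b)
            except Exception as exc:
                report.issues.append((a, b, repr(exc)))
    return report

def autosafecoder_loop(task: str, seeds, max_rounds: int = 3) -> str:
    # Iterate: generate, analyze statically, fuzz dynamically, feed back.
    feedback = None
    code = ""
    for _ in range(max_rounds):
        code = coding_agent(task, feedback)
        static_report = static_analyzer_agent(code)
        fuzz_report = fuzzing_agent(code, seeds)
        if static_report.clean() and fuzz_report.clean():
            return code  # both checks passed
        feedback = static_report.issues + fuzz_report.issues
    return code  # best effort after the iteration budget
```

In this sketch the two analysis agents run on every round, so a fix suggested by one check is always re-validated by the other before the code is accepted.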