Security in code generation remains a pivotal challenge when applying large language models (LLMs). This paper introduces RefleXGen, an innovative method that significantly enhances code security by integrating Retrieval-Augmented Generation (RAG) techniques with guided self-reflection mechanisms inherent in LLMs. Unlike traditional approaches that rely on fine-tuning LLMs or constructing specialized secure-code datasets, both of which can be resource-intensive, RefleXGen iteratively optimizes the code generation process through self-assessment and reflection without requiring extensive resources. Within this framework, the model continuously accumulates and refines its knowledge base, progressively improving the security of the generated code. Experimental results demonstrate that RefleXGen substantially enhances code security across multiple models, achieving improvements of 13.6% with GPT-3.5 Turbo, 6.7% with GPT-4o, 4.5% with CodeQwen, and 5.8% with Gemini. Our findings highlight that improving the quality of model self-reflection is an effective and practical strategy for strengthening the security of AI-generated code.