Artificial Intelligence (AI)-driven code generation tools are increasingly used throughout the software development lifecycle to accelerate coding tasks. However, the security of code generated by Large Language Models (LLMs) remains underexplored, with studies revealing a range of risks and weaknesses. This paper analyzes the security of code generated by LLMs across different programming languages. We introduce a dataset of 200 tasks grouped into six categories to evaluate the ability of LLMs to generate secure and maintainable code. Our research shows that while LLMs can automate code creation, their security effectiveness varies by language. Many models fail to utilize modern security features introduced in recent compiler and toolkit releases, such as Java 17. Moreover, generated code still commonly relies on outdated methods, particularly in C++. This highlights the need to advance LLMs so that they improve security and quality while incorporating emerging best practices in programming languages.