This study evaluates the security of web application code generated by Large Language Models by analyzing 2,500 GPT-4-generated PHP websites. These sites were deployed in Docker containers and tested for vulnerabilities using a hybrid approach combining Burp Suite active scanning, static analysis, and manual review. Our investigation focuses on identifying Insecure File Upload, SQL Injection, Stored XSS, and Reflected XSS in GPT-4-generated PHP code. This analysis highlights potential security risks and the implications of deploying such code in real-world scenarios. Overall, our analysis found 2,440 vulnerable parameters. According to Burp Suite's scan, 11.56% of the sites can be compromised outright. Incorporating the static-analysis results, 26% had at least one vulnerability exploitable through web interaction. Certain coding scenarios, such as file upload functionality, are insecure 78% of the time, underscoring significant risks to software safety and security. To support further research, we have made the source code and a detailed vulnerability record for each sample publicly available. This study emphasizes the crucial need for thorough testing and evaluation when generative AI technologies are used in software development.