In recent years, JavaScript has become the most widely used programming language, especially in web development. However, writing secure JavaScript code is not trivial, and programmers often make mistakes that lead to security vulnerabilities in web applications. Large Language Models (LLMs) have demonstrated substantial advancements across multiple domains, and their evolving capabilities indicate their potential for automatic code generation based on a required specification, including automatic bug fixing. In this study, we explore the accuracy of LLMs, namely ChatGPT and Bard, in finding and fixing security vulnerabilities in JavaScript programs. We also investigate the impact of context in a prompt on directing LLMs to produce a correct patch of vulnerable JavaScript code. Our experiments on real-world software vulnerabilities show that while LLMs are promising in automatic program repair of JavaScript code, achieving a correct bug fix often requires an appropriate amount of context in the prompt.