Loop vulnerabilities are among the most hazardous constructs in software development. They can lead to infinite loops, resource exhaustion, or logical errors that degrade performance and compromise security. These problems often go undetected by traditional static analyzers, which rely on syntactic patterns and therefore struggle to capture semantic flaws. Large Language Models (LLMs) offer new potential for vulnerability detection because of their ability to understand code contextually. Moreover, unlike commercial models such as ChatGPT or Gemini, local LLMs address privacy, latency, and dependency concerns by enabling efficient offline analysis. This study therefore proposes a prompt-based framework that utilizes local LLMs to detect loop vulnerabilities in Python 3.7+ code. The framework targets three categories of loop-related issues: control and logic errors, security risks inside loops, and resource-management inefficiencies. A generalized, structured prompt-based framework was designed and tested with two locally deployed LLMs (LLaMA 3.2, 3B and Phi 3.5, 4B), guiding their behavior through iterative prompting. The framework incorporates key safeguards such as language-specific awareness, code-aware grounding, version sensitivity, and hallucination prevention. The LLMs' outputs were validated against a manually established ground truth, and the results indicate that Phi outperforms LLaMA in precision, recall, and F1-score. The findings underscore the importance of effective prompt design for secure and accurate code vulnerability analysis with local LLMs.
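The three categories of loop issues can be illustrated with minimal Python snippets. These are hypothetical examples of the issue classes, not code drawn from the study's dataset:

```python
# Category 1 -- control/logic error: the loop variable is never updated,
# so this function would hang forever if called.
def sum_to_broken(n):
    total, i = 0, 0
    while i < n:
        total += i  # bug: i is never incremented -> infinite loop
    return total

# Corrected version: i advances toward the exit condition.
def sum_to(n):
    total, i = 0, 0
    while i < n:
        total += i
        i += 1
    return total

# Category 2 -- security risk inside a loop: eval() on untrusted items
# allows arbitrary code execution on every iteration.
def parse_values_risky(items):
    out = []
    for x in items:
        out.append(eval(x))  # vulnerable: attacker-controlled input
    return out

# Safer alternative: parse with a strict converter instead of eval.
def parse_values(items):
    return [int(x) for x in items]

# Category 3 -- resource-management inefficiency: repeated string
# concatenation copies the accumulator each iteration (quadratic cost).
def join_slow(parts):
    s = ""
    for p in parts:
        s += p  # builds a new string every pass
    return s

# Linear-time alternative using a single join.
def join_fast(parts):
    return "".join(parts)
```

Snippets like these are the kind of input a detector must classify: the vulnerable and corrected variants are syntactically similar, which is why semantic, context-aware analysis is needed.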
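The safeguards named above (language-specific awareness, code-aware grounding, version sensitivity, hallucination prevention) could be assembled into a structured prompt along these lines. This is a hypothetical sketch; the function name, wording, and template structure are assumptions, not the study's actual prompt:

```python
def build_loop_audit_prompt(code: str, py_version: str = "3.7") -> str:
    """Assemble a structured audit prompt embedding the four safeguards
    described in the abstract (illustrative template only)."""
    return "\n".join([
        # language-specific awareness + version sensitivity
        f"You are auditing Python {py_version}+ code for loop vulnerabilities.",
        "Report only issues in these categories: control and logic errors,",
        "security risks inside loops, and resource-management inefficiencies.",
        # hallucination prevention: forbid invented findings
        "Cite the exact line of code for every finding. If no issue is",
        "present, answer 'NO ISSUES FOUND' and do not invent findings.",
        # code-aware grounding: delimit the code under analysis
        "--- CODE START ---",
        code,
        "--- CODE END ---",
    ])

snippet = "while n > 0:\n    total += n\n"
prompt = build_loop_audit_prompt(snippet)
```

The resulting string would then be sent to a locally served model (e.g., via an inference runtime's chat endpoint); the delimiters anchor the model's answer to the supplied code rather than its training priors.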