Static analysis tools provide a powerful means to detect security vulnerabilities by specifying queries that encode vulnerable code patterns. However, writing such queries is challenging and requires expertise in both security and program analysis. To address this challenge, we present QLCoder, an agentic framework that automatically synthesizes queries in CodeQL, a powerful static analysis engine, directly from given CVE metadata. QLCoder embeds an LLM in a synthesis loop with execution feedback, while constraining its reasoning through a custom MCP interface that allows structured interaction with a Language Server Protocol server (for syntax guidance) and a RAG database (for semantic retrieval of queries and documentation). This approach allows QLCoder to generate syntactically and semantically valid security queries. We evaluate QLCoder on 176 existing CVEs across 111 Java projects. Building upon the Claude Code agent framework, QLCoder synthesizes correct queries, meaning queries that detect the CVE in the vulnerable version but not in the patched version, for 53.4% of CVEs. In comparison, Claude Code alone synthesizes correct queries for only 10%.
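The synthesis loop with execution feedback and the correctness criterion described above can be illustrated with a minimal Python sketch. This is not QLCoder's implementation: the names `Attempt`, `synthesize`, `propose`, and `run_query` are hypothetical, the LLM and the CodeQL engine are abstracted as plain callables, and counting alerts per version is a simplifying assumption about what "detecting the CVE" means.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Attempt:
    """One candidate query and its (hypothetical) alert counts per version."""
    query: str
    hits_vulnerable: int
    hits_patched: int

    @property
    def is_correct(self) -> bool:
        # Criterion from the abstract: the query flags the vulnerable version
        # but not the patched one. Alert counts are a simplifying assumption.
        return self.hits_vulnerable > 0 and self.hits_patched == 0


def synthesize(
    propose: Callable[[str, Optional[str]], str],
    run_query: Callable[[str, str], int],
    cve_metadata: str,
    max_iterations: int = 5,
) -> Optional[Attempt]:
    """Iterate propose -> execute -> feedback until a query is correct.

    `propose` stands in for the LLM (assumed signature: metadata, feedback).
    `run_query` stands in for the analysis engine (assumed signature:
    query text, version label) and returns a number of alerts.
    """
    feedback: Optional[str] = None
    for _ in range(max_iterations):
        query = propose(cve_metadata, feedback)
        attempt = Attempt(
            query=query,
            hits_vulnerable=run_query(query, "vulnerable"),
            hits_patched=run_query(query, "patched"),
        )
        if attempt.is_correct:
            return attempt
        # Execution feedback for the next proposal.
        feedback = (
            f"vulnerable alerts: {attempt.hits_vulnerable}, "
            f"patched alerts: {attempt.hits_patched}"
        )
    return None  # iteration budget exhausted without a correct query
```

In this sketch a run ends as soon as one candidate satisfies the criterion, and returns `None` once the iteration budget is spent. The syntax guidance and retrieval components mentioned in the abstract are deliberately left out; they would sit inside whatever `propose` does.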