Static analysis tools provide a powerful means of detecting security vulnerabilities: users specify queries that encode vulnerable code patterns. However, writing such queries is challenging and requires expertise in both security and program analysis. To address this challenge, we present QLCoder, an agentic framework that automatically synthesizes queries in CodeQL, a powerful static analysis engine, directly from the metadata of a given CVE. QLCoder embeds an LLM in a synthesis loop with execution feedback, while constraining its reasoning through a custom MCP interface that allows structured interaction with a Language Server Protocol server (for syntax guidance) and a RAG database (for semantic retrieval of queries and documentation). This approach allows QLCoder to generate syntactically and semantically valid security queries. We evaluate QLCoder on 176 existing CVEs across 111 Java projects. Building upon the Claude Code agent framework, QLCoder synthesizes correct queries, which detect the CVE in the vulnerable version but not in the patched version, for 53.4% of CVEs. In comparison, Claude Code alone synthesizes correct queries for only 10% of CVEs. The QLCoder code is publicly available at https://github.com/neuralprogram/QLCoder.
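To make the synthesis loop and the correctness criterion concrete, the following is a minimal Python sketch, not QLCoder's actual implementation. All names (`synthesize`, `propose`, `run`, `detects_only_vulnerable`) are hypothetical, the LLM and the CodeQL engine are abstracted as plain callables, and the MCP, language-server, and RAG components are omitted. The success check is a simplification that assumes a query counts as correct when it raises at least one alert on the vulnerable version and none on the patched version.

```python
from typing import Callable, Optional, Set

# Hypothetical types: a query is plain text, a result is a set of flagged locations.
Query = str
Findings = Set[str]


def detects_only_vulnerable(vuln: Findings, patched: Findings) -> bool:
    """Simplified check: some alert on the vulnerable version, none on the patched one."""
    return bool(vuln) and not patched


def synthesize(
    propose: Callable[[str, Optional[str]], Query],  # stands in for the LLM
    run: Callable[[Query, str], Findings],           # stands in for the query engine
    cve_metadata: str,
    max_iters: int = 5,
) -> Optional[Query]:
    """Propose queries until one passes the check or the iteration budget runs out."""
    feedback: Optional[str] = None
    for _ in range(max_iters):
        query = propose(cve_metadata, feedback)
        try:
            vuln = run(query, "vulnerable")
            patched = run(query, "patched")
        except ValueError as err:
            # Assumption: the engine reports an invalid query by raising ValueError.
            feedback = f"query failed to run: {err}"
            continue
        if detects_only_vulnerable(vuln, patched):
            return query
        # Execution feedback passed to the next proposal.
        feedback = f"vulnerable hits={len(vuln)}, patched hits={len(patched)}"
    return None
```

In this sketch the feedback string is the only channel from execution back to the proposer, so a query that fails to run and a query that runs but also flags the patched version produce different hints for the next attempt.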