Detecting defects and vulnerabilities at an early stage has long been a challenge in software engineering. Static analysis, a technique that inspects code without executing it, has emerged as a key strategy for addressing this challenge. Among recent advances, graph-based representations, particularly the Code Property Graph (CPG), have gained traction because they capture code structure and semantics comprehensively. Despite this progress, existing graph-based analysis tools still face performance and scalability issues. The main bottleneck lies in the size and complexity of the CPG, which makes analyzing large codebases slow and memory-intensive. In addition, the query rules used by current tools can be overly specific. We therefore introduce QVoG, a graph-based static analysis platform for detecting defects and vulnerabilities. It employs a compressed CPG representation to keep the graph at a reasonable size, thereby improving overall query efficiency. On top of the CPG, it offers a declarative query language that simplifies writing queries. Furthermore, it integrates machine learning to improve the generality of vulnerability detection. For projects consisting of 1,000,000+ lines of code, QVoG completes analysis in approximately 15 minutes, compared with 19 minutes for CodeQL.