SkipAnalyzer: An Embodied Agent for Code Analysis with Large Language Models

We introduce SkipAnalyzer, the first large language model (LLM)-powered embodied agent for static code analysis. It can detect bugs, filter false-positive warnings, and patch the detected bugs without human intervention. SkipAnalyzer consists of three components: 1) an LLM-based static bug detector that scans source code and reports specific types of bugs, 2) an LLM-based false-positive filter that identifies false-positive warnings in the results of static bug detectors to improve detection accuracy, and 3) an LLM-based patch generator that generates patches for the detected bugs. As a proof of concept, SkipAnalyzer is built on ChatGPT, which has exhibited outstanding performance in various software engineering tasks. To evaluate SkipAnalyzer, we focus on two typical and critical bug types targeted by static bug detection: Null Dereference and Resource Leak. We use Infer to help gather instances of these two bug types from 10 open-source projects. The resulting experiment dataset contains 222 instances of Null Dereference bugs and 46 instances of Resource Leak bugs. Our study demonstrates that SkipAnalyzer performs strongly on all three static analysis tasks: bug detection, false-positive warning removal, and bug repair. In static bug detection, SkipAnalyzer achieves accuracy of up to 68.37% for Null Dereference bugs and 76.95% for Resource Leak bugs, outperforming the current leading bug detector, Infer. In false-positive warning removal, SkipAnalyzer reaches a precision of up to 93.88% for Null Dereference bugs and 63.33% for Resource Leak bugs, surpassing state-of-the-art false-positive warning removal tools. In bug repair, SkipAnalyzer generates syntactically correct patches for its detected bugs with a success rate of up to 97.30%.
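The three-component design described above can be pictured as a simple pipeline. The following is only a minimal sketch under stated assumptions, not the paper's implementation: the `LLM` callable, the `BugWarning` type, the function names, and the prompt wording are all hypothetical and chosen for illustration. Any function that maps a prompt string to a reply string (for example, a wrapper around a chat model) could be plugged in.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

# Hypothetical interface: any function mapping a prompt string to a reply string.
LLM = Callable[[str], str]


@dataclass
class BugWarning:
    """Illustrative record for one reported bug (not a type from the paper)."""
    bug_type: str  # e.g. "Null Dereference" or "Resource Leak"
    snippet: str   # the source code under analysis


def detect(llm: LLM, snippet: str, bug_type: str) -> Optional[BugWarning]:
    """Component 1: ask the model whether the snippet contains the bug type."""
    reply = llm(
        f"Does this code contain a {bug_type} bug? Answer yes or no.\n{snippet}"
    )
    if reply.strip().lower().startswith("yes"):
        return BugWarning(bug_type, snippet)
    return None


def is_true_positive(llm: LLM, warning: BugWarning) -> bool:
    """Component 2: ask the model whether a reported warning is a real bug."""
    reply = llm(
        f"Is this {warning.bug_type} warning a real bug or a false positive? "
        f"Answer real or false.\n{warning.snippet}"
    )
    return reply.strip().lower().startswith("real")


def generate_patch(llm: LLM, warning: BugWarning) -> str:
    """Component 3: ask the model for a patched version of the snippet."""
    return llm(
        f"Fix the {warning.bug_type} bug in this code. "
        f"Return only the patched code.\n{warning.snippet}"
    )


def analyze(llm: LLM, snippets: List[str], bug_type: str) -> List[str]:
    """Run detection, false-positive filtering, and patching in sequence."""
    patches = []
    for snippet in snippets:
        warning = detect(llm, snippet, bug_type)
        # Only warnings that survive the filter are sent to the patch generator.
        if warning is not None and is_true_positive(llm, warning):
            patches.append(generate_patch(llm, warning))
    return patches
```

In this sketch the filter sits between detection and repair, so a patch is requested only for warnings the model judges to be real. Per the abstract, the filter is applied to the output of static bug detectors in general, so the same `is_true_positive` step could equally be fed warnings produced by an external tool such as Infer.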
