Vulnerabilities introduced during software development pose a significant threat to users, and patches are the most effective way to combat them. In a large-scale software system, testing for the presence of a security patch in every affected binary is crucial to ensuring system security. Determining whether a binary has been patched for a known vulnerability is challenging, because the patched and vulnerable versions may differ only slightly. Existing approaches mainly focus on the case where the reference and target binaries are compiled with the same compiler options. In practice, however, developers often compile programs with very different compiler options depending on the situation, which makes existing methods inaccurate. In this paper, we propose a new approach named PS3, short for precise patch presence test based on semantic-level symbolic signature. PS3 uses symbolic emulation to extract signatures that are stable under different compiler options. It then tests for the presence of the patch precisely by comparing the signatures of the reference and the target at the semantic level. To evaluate the effectiveness of our approach, we constructed a dataset of 3,631 (CVE, binary) pairs covering 62 recent CVEs in four C/C++ projects. The experimental results show that PS3 achieves a precision of 0.82, a recall of 0.97, and an F1 score of 0.89. PS3 outperforms the state-of-the-art baselines, improving the F1 score by 33%, and remains stable across different compiler options.
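As a quick consistency check on the reported scores, and assuming the F1 score is the usual harmonic mean of precision P and recall R (the abstract does not spell out the definition), the rounded values agree with each other:

\[
F_1 = \frac{2PR}{P + R} = \frac{2 \times 0.82 \times 0.97}{0.82 + 0.97} = \frac{1.5908}{1.79} \approx 0.889 \approx 0.89
\]

Because the inputs are themselves rounded to two decimals, this only confirms agreement at that precision.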