Vulnerability prediction helps identify security issues more efficiently, but it typically requires the source code of the target software system, which is a restrictive assumption. This paper presents an experimental study on predicting vulnerabilities in binary code without access to source code or complex representations of the binary. The key idea is to decompile the binary file through neural decompilation and then predict vulnerabilities by applying deep learning to the decompiled source code. The results outperform the state of the art in both neural decompilation and vulnerability prediction, showing that this approach can identify vulnerable programs in both binary (vulnerable/non-vulnerable) and multi-class (type of vulnerability) classification.