Software vulnerabilities pose significant risks to computer systems, affecting our daily lives, productivity, and even our health. Identifying and fixing security vulnerabilities promptly is crucial to preventing hacking and data breaches. However, current vulnerability identification methods, both classical and deep learning-based, have critical drawbacks that keep them from meeting the demands of the contemporary software industry. To address these issues, we present JFinder, a novel architecture for Java vulnerability identification that uses quad self-attention and pre-training mechanisms to combine structural information with semantic representations. Experimental results show that JFinder outperforms all baseline methods, achieving an accuracy of 0.97 on the CWE dataset and an F1 score of 0.84 on the PROMISE dataset. Furthermore, a case study shows that JFinder accurately identifies four cases of vulnerabilities after patching.