Software vulnerability detection (SVD) is a critical challenge in modern systems. Large language models (LLMs) offer natural-language explanations alongside predictions, but most prior work focuses on binary evaluation, and the explanations often lack semantic consistency with Common Weakness Enumeration (CWE) categories. We propose VulReaD, a knowledge-graph-guided approach for vulnerability reasoning and detection that moves beyond binary classification toward CWE-level reasoning. VulReaD leverages a security knowledge graph (KG) as a semantic backbone and uses a strong teacher LLM to generate CWE-consistent contrastive reasoning supervision, enabling student model training without manual annotations. Student models are fine-tuned with Odds Ratio Preference Optimization (ORPO) to encourage taxonomy-aligned reasoning while suppressing unsupported explanations. Across three real-world datasets, VulReaD improves binary-detection F1 by 8-10% and raises multi-class Macro-F1 by 30% and Micro-F1 by 18% over state-of-the-art baselines. Results show that LLMs outperform deep learning baselines in binary detection and that KG-guided reasoning enhances CWE coverage and interpretability.
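The ORPO objective referred to above combines a standard supervised fine-tuning loss on the preferred (CWE-consistent) reasoning with an odds-ratio penalty that pushes the model's odds of the preferred response above those of the rejected one. The sketch below is a minimal numerical illustration under stated assumptions, not the authors' implementation: `avg_logp_*` are assumed to be length-normalized sequence log-probabilities from the student model, and the weighting coefficient `lam` is a hypothetical hyperparameter.

```python
import math

def log_odds(avg_logp: float) -> float:
    """log odds(y|x) = log(p / (1 - p)), with p = exp(avg token log-prob).

    Assumes avg_logp < 0, so p is strictly between 0 and 1.
    """
    p = math.exp(avg_logp)
    return math.log(p) - math.log(1.0 - p)

def orpo_loss(nll_chosen: float,
              avg_logp_chosen: float,
              avg_logp_rejected: float,
              lam: float = 0.1) -> float:
    """ORPO loss sketch: L = L_SFT + lam * L_OR.

    nll_chosen         -- SFT negative log-likelihood of the preferred response
    avg_logp_chosen    -- avg token log-prob of the preferred (CWE-aligned) reasoning
    avg_logp_rejected  -- avg token log-prob of the rejected (unsupported) reasoning
    """
    # Log odds ratio between preferred and rejected responses.
    ratio = log_odds(avg_logp_chosen) - log_odds(avg_logp_rejected)
    # L_OR = -log sigmoid(ratio): small when the preferred response dominates.
    l_or = -math.log(1.0 / (1.0 + math.exp(-ratio)))
    return nll_chosen + lam * l_or
```

The single `lam`-weighted term is what lets ORPO fold preference alignment into ordinary fine-tuning without a separate reference model: as the preferred reasoning becomes more likely than the rejected one, `l_or` shrinks toward zero and the loss reduces to plain SFT.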