Source code vulnerability detection remains a long-standing challenge due to the increasing scale, structural complexity, and semantic diversity of modern codebases. Conventional static-analysis or rule-based approaches often fail to capture subtle execution dependencies, while single-modality learning models tend to overlook critical structural information embedded beyond the lexical surface of source code. To improve robustness across heterogeneous code patterns, we propose FusionVul, a joint representation learning framework that integrates sequential syntactic representations extracted by a pretrained Transformer encoder with structural semantics propagated through a graph neural network. The framework further incorporates a cross-attention-based feature fusion network to enable fine-grained cross-modal interaction and employs a sample-aware weighting mechanism to integrate multiple predictive branches. Experimental results on four datasets demonstrate that FusionVul achieves superior F1 scores on datasets with highly dispersed function size distributions and broader vulnerability-type coverage, such as SVulD and DiverseVul, reflecting its capability to capture complex and diverse vulnerability patterns.
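As a rough illustration of the two mechanisms named in the abstract, the sketch below shows cross-attention between token-level and graph-node features, followed by a sample-aware weighting of three predictive branches. It is not the FusionVul implementation: the abstract does not give the architecture details, so every function name, the choice of three branches (sequence, graph, fused), the mean pooling, the absence of learned query/key/value projections, and the softmax gate are assumptions made purely for illustration. It also assumes token and node vectors share the same dimension.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def mean_pool(vectors):
    """Average a list of equal-length vectors into one vector."""
    n = len(vectors)
    return [sum(v[d] for v in vectors) / n for d in range(len(vectors[0]))]

def cross_attention(queries, keys_values):
    """Scaled dot-product attention: each query attends over all key vectors.

    Assumption: keys double as values and no learned projections are applied.
    """
    scale = math.sqrt(len(queries[0]))
    attended = []
    for q in queries:
        weights = softmax([dot(q, k) / scale for k in keys_values])
        attended.append([
            sum(w * v[d] for w, v in zip(weights, keys_values))
            for d in range(len(q))
        ])
    return attended

def sample_aware_predict(token_vecs, node_vecs, branch_w, gate_w):
    """Combine three hypothetical branches with per-sample weights.

    branch_w and gate_w each hold one weight vector per branch; in a real
    model these would be learned parameters.
    """
    feats = [
        mean_pool(token_vecs),                              # sequence branch
        mean_pool(node_vecs),                               # graph branch
        mean_pool(cross_attention(token_vecs, node_vecs)),  # fused branch
    ]
    # Each branch produces its own vulnerability probability.
    probs = [sigmoid(dot(w, f)) for w, f in zip(branch_w, feats)]
    # The gate scores depend on this sample's features, so the branch
    # weights differ from one input function to the next.
    weights = softmax([dot(g, f) for g, f in zip(gate_w, feats)])
    final = sum(w * p for w, p in zip(weights, probs))
    return final, weights
```

Calling `sample_aware_predict` with a few token vectors, a few node vectors, and one weight vector per branch returns a probability together with the three branch weights, which sum to one. A softmax gate keeps the final score a convex combination of the branch outputs; other weighting schemes would fit the abstract's description equally well.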