ContractShield: Bridging Semantic-Structural Gaps via Hierarchical Cross-Modal Fusion for Multi-Label Vulnerability Detection in Obfuscated Smart Contracts

April 3


Minh-Dai Tran-Duong, Nguyen Hai Phong, Nguyen Chi Thanh, Doan Minh Trung, Tram Truong-Huu, Van-Hau Pham, Phan The Duy

From arXiv; 9 figures, 8 tables, 16 pages

Smart contracts are increasingly targeted by adversaries who use obfuscation techniques such as bogus code injection and control flow manipulation to evade vulnerability detection. Existing multimodal methods often process semantic, temporal, and structural features in isolation and fuse them with simple strategies such as concatenation. This neglects cross-modal interactions and weakens robustness, because obfuscating a single modality can sharply degrade detection accuracy. To address these challenges, we propose ContractShield, a robust multimodal framework whose novel fusion mechanism correlates multiple complementary features through three levels of fusion. First, self-attention identifies vulnerability-indicating patterns within each feature space. Next, cross-modal attention establishes meaningful connections between complementary signals across modalities. Finally, adaptive weighting dynamically calibrates each feature's contribution based on its reliability under obfuscation. For feature extraction, ContractShield integrates (1) CodeBERT with a sliding window mechanism to capture semantic dependencies in source code, (2) extended long short-term memory (xLSTM) to model temporal dynamics in opcode sequences, and (3) GATv2 to identify structural invariants in control flow graphs (CFGs) that remain stable across obfuscation. Empirical evaluation demonstrates the resilience of ContractShield, which achieves an 89% Hamming Score with only a 1-3% drop compared to non-obfuscated data. The framework simultaneously detects five major vulnerability types with a 91% F1-score, outperforming state-of-the-art approaches by 6-15% under adversarial conditions.
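
The three fusion levels described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: it assumes each modality has already been encoded into token vectors of a shared dimension, it omits learned projection matrices, and the names `hierarchical_fusion` and `gate_vector` (a stand-in for a learned reliability scorer) are hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def attention(query, key, value):
    """Scaled dot-product attention for (n_q, d), (n_k, d), (n_k, d) arrays."""
    scores = query @ key.T / np.sqrt(query.shape[-1])
    return softmax(scores, axis=-1) @ value


def hierarchical_fusion(features, gate_vector):
    """Fuse per-modality token matrices into one vector.

    features: dict mapping modality name -> (n_tokens, d) array.
    gate_vector: (d,) array, a hypothetical stand-in for a learned scorer.
    """
    # Level 1: self-attention inside each modality.
    refined = {name: attention(x, x, x) for name, x in features.items()}

    # Level 2: cross-modal attention. Each modality queries the tokens of
    # all other modalities; a residual connection keeps its own signal.
    crossed = {}
    for name, x in refined.items():
        others = np.concatenate(
            [v for k, v in refined.items() if k != name], axis=0
        )
        crossed[name] = x + attention(x, others, others)

    # Mean-pool each modality to one vector (pooling choice is an assumption).
    pooled = np.stack([crossed[name].mean(axis=0) for name in features])

    # Level 3: adaptive weighting. A softmax over per-modality scores yields
    # weights that sum to 1, so a less reliable modality can be down-weighted.
    weights = softmax(pooled @ gate_vector)
    fused = weights @ pooled
    return fused, weights


# Toy inputs standing in for CodeBERT, xLSTM, and GATv2 outputs.
rng = np.random.default_rng(0)
d = 8
features = {
    "source": rng.normal(size=(5, d)),
    "opcode": rng.normal(size=(7, d)),
    "cfg": rng.normal(size=(4, d)),
}
fused, weights = hierarchical_fusion(features, rng.normal(size=d))
```

In a trained model the gate would be learned, so the weights could shift toward whichever modality is least affected by a given obfuscation; here they are driven by random values and only demonstrate the data flow.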

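
The abstract does not define the Hamming Score it reports. One common reading for multi-label classification is the fraction of individual label decisions that are correct, that is, 1 minus the Hamming loss. The sketch below assumes that definition; the function name `hamming_score` and the toy labels are illustrative only and do not come from the paper.

```python
def hamming_score(y_true, y_pred):
    """Fraction of correct label decisions over all samples and labels.

    Assumes Hamming Score = 1 - Hamming loss. Both arguments are lists of
    equal-length 0/1 label vectors, one vector per contract.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same number of rows")
    correct = 0
    total = 0
    for true_row, pred_row in zip(y_true, y_pred):
        if len(true_row) != len(pred_row):
            raise ValueError("label vectors must have the same length")
        correct += sum(int(t == p) for t, p in zip(true_row, pred_row))
        total += len(true_row)
    return correct / total


# Two contracts, five hypothetical vulnerability labels each.
y_true = [[1, 0, 0, 1, 0], [0, 1, 0, 0, 0]]
y_pred = [[1, 0, 0, 0, 0], [0, 1, 0, 0, 1]]
score = hamming_score(y_true, y_pred)  # 8 of 10 label decisions match
```

Under this definition a contract with one missed label out of five still contributes four correct decisions, which is why the metric is more forgiving than exact-match accuracy.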