Decentralized applications (DApps) face significant security risks from smart contract vulnerabilities, and traditional detection methods struggle to address emerging, machine-unauditable flaws. This paper proposes a novel approach that leverages fine-tuned Large Language Models (LLMs) to enhance smart contract vulnerability detection. We introduce a comprehensive dataset of 215 real-world DApp projects (4,998 contracts), including hard-to-detect logical errors such as token price manipulation, addressing the limitations of existing simplified benchmarks. By fine-tuning LLMs (Llama3-8B and Qwen2-7B) with Full-Parameter Fine-Tuning (FFT) and Low-Rank Adaptation (LoRA), our method achieves superior performance, attaining an F1-score of 0.83 with FFT and data augmentation via Random Over Sampling (ROS). Comparative experiments demonstrate significant improvements over prompt-based LLMs and state-of-the-art tools. Notably, the approach excels at detecting machine-unauditable vulnerabilities, achieving 0.97 precision and 0.68 recall on price manipulation flaws. The results underscore the effectiveness of domain-specific LLM fine-tuning and data augmentation in addressing real-world DApp security challenges, offering a robust solution for protecting the blockchain ecosystem.
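As a minimal illustration of two techniques named in the abstract, the sketch below shows Random Over Sampling (ROS) for balancing vulnerability-class labels and the precision/recall/F1 computation used to report detection quality. This is a plain-Python sketch under our own assumptions; the function names and data layout are illustrative and not taken from the paper's implementation.

```python
import random
from collections import Counter

def random_over_sample(samples, labels, seed=0):
    """Random Over Sampling (ROS): duplicate minority-class samples
    at random until every class matches the majority-class count."""
    rng = random.Random(seed)
    by_label = {}
    for s, y in zip(samples, labels):
        by_label.setdefault(y, []).append(s)
    target = max(len(group) for group in by_label.values())
    out_samples, out_labels = [], []
    for y, group in by_label.items():
        # Keep the originals, then pad with random duplicates up to `target`.
        picks = group + [rng.choice(group) for _ in range(target - len(group))]
        out_samples.extend(picks)
        out_labels.extend([y] * target)
    return out_samples, out_labels

def f1_score(y_true, y_pred, positive=1):
    """Binary precision/recall/F1 for the `positive` (vulnerable) label."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0
```

For example, oversampling a 2:1 class imbalance yields a 2:2 split, and a classifier with one true positive, one false positive, and one false negative scores F1 = 0.5.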