Smart contracts are important for digital finance, yet they are hard to patch once deployed. Prior work mostly studies LLMs for vulnerability detection, leaving their automated exploit generation (AEG) capability unclear. This paper closes that gap with \textsc{ReX}, a framework that links LLM-based exploit synthesis to the Foundry stack for end-to-end generation, compilation, execution, and verification. Five recent LLMs are evaluated across eight common vulnerability classes, supported by a curated dataset of 38{+} real incident PoCs and three automation aids: prompt refactoring, a compiler feedback loop, and templated test harnesses. Results indicate strong performance on single-contract PoCs and weak performance on cross-contract attacks; outcomes depend mainly on the model and bug type, with code structure and prompt tuning contributing little. The study also surfaces gaps in current defenses against LLM-driven AEG, pointing to the need for stronger protections.