An Effective and Cost-Efficient Agentic Framework for Ethereum Smart Contract Auditing

Smart contract security is paramount, yet identifying intricate business logic vulnerabilities remains a persistent challenge because existing solutions fall short: manual auditing does not scale, static analysis tools are plagued by false positives, and fuzzers struggle to reach deep logic states in complex systems. Even emerging AI-based methods suffer from hallucinations, context constraints, and a heavy reliance on expensive, proprietary Large Language Models (LLMs). In this paper, we introduce Heimdallr, an automated auditing agent designed to overcome these hurdles through four core innovations. By reorganizing code at the function level, Heimdallr minimizes context overhead while preserving essential business logic. It then employs heuristic reasoning to detect complex vulnerabilities and automatically chain functional exploits. Finally, a cascaded verification layer validates these findings to eliminate false positives. Notably, this approach achieves high performance on lightweight, open-source models such as GPToss-120B without relying on proprietary systems. In our evaluation, Heimdallr successfully reconstructed 17 out of 20 real-world attacks that occurred after June 2025 (total losses of $384M) and uncovered 4 confirmed zero-day vulnerabilities, safeguarding $400M in total value locked (TVL). Compared with state-of-the-art (SOTA) baselines, including both official industrial tools and academic tools, Heimdallr reduces analysis time by up to 97.59% and financial costs by up to 98.77% while boosting detection precision by over 93.66%. When applied to auditing contests, Heimdallr achieves a 92.45% detection rate at a cost of only $2.31 per 10K lines of code (LOC). We provide production-ready auditing services and release our benchmarks to support future work.
