Ethereum smart contracts are autonomous decentralized applications on the blockchain that often manage assets worth millions of dollars, and they have become primary targets for cyberattacks. In 2023 alone, vulnerabilities in such contracts led to financial losses exceeding one billion US dollars. To counter these threats, academic and commercial entities have developed various tools to detect and mitigate vulnerabilities in smart contracts. Our study investigates the gap between the effectiveness of existing security scanners and the vulnerabilities that still persist in practice. We compiled four distinct datasets for this analysis. The first comprises 77,219 source code samples extracted directly from the blockchain, while the second includes over 4 million bytecode samples obtained from Ethereum Mainnet and testnets. The other two datasets consist of nearly 14,000 manually annotated smart contracts and 373 smart contracts verified through audits, providing a foundation for a rigorous ground truth analysis on bytecode and source code. Using the unlabeled datasets, we conducted a comprehensive quantitative evaluation of 17 vulnerability scanners, revealing considerable discrepancies in their findings. Our analysis of the ground truth datasets indicated poor performance across all the tools we tested. This study identifies the reasons for this poor performance and shows that the current state of the art in smart contract security falls short of addressing open problems: effectively detecting vulnerabilities remains a significant and unresolved challenge.