Smart contracts are small programs on the blockchain that often handle valuable assets. Vulnerabilities in smart contracts can be costly, as has been shown time and again. Countermeasures are in high demand and include best-practice recommendations as well as tools supporting development, program verification, and post-deployment analysis. Many tools focus on detecting the presence or absence of a subset of the known vulnerabilities, delivering results of varying quality. Most comparative tool evaluations select a handful of tools and test them against each other. In the best case, the evaluation is based on a small ground truth. For Ethereum, several author groups have made commendable efforts to classify contracts manually. However, a comprehensive ground truth is still lacking. In this work, we construct a ground truth based on publicly available benchmark sets for Ethereum smart contracts with manually checked ground truth data. We develop a method to unify these sets. Additionally, we devise strategies for matching entries that pertain to the same contract, so that we can determine overlaps and disagreements between the sets and consolidate the disagreements. Finally, we assess the quality of the included ground truth sets. Our work reduces inconsistencies, redundancies, and incompleteness while increasing the number of data points and heterogeneity.