Smart contracts are small programs on the blockchain that often handle valuable assets. Vulnerabilities in smart contracts can be costly, as has been shown time and again. Countermeasures are in high demand and include best-practice recommendations as well as tools supporting development, program verification, and post-deployment analysis. Many tools focus on detecting the presence or absence of a subset of the known vulnerabilities, delivering results of varying quality. Most comparative tool evaluations resort to selecting a handful of tools and testing them against each other. In the best case, the evaluation is based on a small ground truth. For Ethereum, several author groups have made commendable efforts to classify contracts manually. However, a comprehensive ground truth is still lacking. In this work, we construct a ground truth from publicly available benchmark sets for Ethereum smart contracts that come with manually checked ground truth data. We develop a method to unify these sets. Additionally, we devise strategies for matching entries that pertain to the same contract, so that we can determine overlaps and disagreements between the sets and consolidate the disagreements. Finally, we assess the quality of the included ground truth sets. Our work reduces inconsistencies, redundancies, and incompleteness while increasing the number of data points and their heterogeneity.
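As a rough illustration of the unification and matching idea described above, the following sketch merges entries from several benchmark sets and separates agreements from disagreements. It is not the method of this work: the record layout, the set names, the sample addresses, and the choice of matching entries by lowercased address plus vulnerability name are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical records: (benchmark_set, contract_address, vulnerability, label).
# label is True if the set marks the contract as vulnerable, False otherwise.
# All names and values below are invented for illustration.
entries = [
    ("set_a", "0xAbC1", "reentrancy", True),
    ("set_b", "0xabc1", "reentrancy", False),
    ("set_a", "0xDEF2", "overflow", True),
    ("set_b", "0xdef2", "overflow", True),
    ("set_c", "0x1234", "reentrancy", False),
]


def normalize(address):
    """Assumed matching key: hex addresses compared case-insensitively."""
    return address.strip().lower()


def unify(records):
    """Group labels by (normalized address, vulnerability) across all sets."""
    merged = defaultdict(dict)
    for source, address, vulnerability, label in records:
        merged[(normalize(address), vulnerability)][source] = label
    return dict(merged)


def classify(merged):
    """Split keys covered by at least two sets into agreements and disagreements."""
    agreements, disagreements = [], []
    for key, labels in merged.items():
        if len(labels) < 2:
            continue  # only one set covers this entry, so there is no overlap
        if len(set(labels.values())) == 1:
            agreements.append(key)
        else:
            disagreements.append(key)
    return agreements, disagreements


merged = unify(entries)
agreements, disagreements = classify(merged)
```

In this toy data, the two sets agree on the overflow entry and disagree on the reentrancy entry for the first address; the third address appears in only one set and therefore contributes a new data point without any overlap. Consolidating a disagreement would still require a manual decision, which the sketch does not attempt.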