Evolution of Automated Weakness Detection in Ethereum Bytecode: a Comprehensive Study

Blockchain programs (also known as smart contracts) manage valuable assets like cryptocurrencies and tokens, and implement protocols in domains like decentralized finance (DeFi) and supply-chain management. These applications require a high level of security, which is hard to achieve given the transparency of public blockchains. Numerous tools support developers and auditors in detecting weaknesses. Because the technology is young, blockchains and their utilities evolve quickly, making it challenging for tools and developers to keep pace. In this work, we study the robustness of code analysis tools and the evolution of weakness detection on a dataset representing six years of blockchain activity. We focus on Ethereum as the crypto ecosystem with the largest number of developers and deployed programs. We investigate the behavior of individual tools as well as the agreement among several tools that address similar weaknesses. Our study is the first based on the entire body of bytecode deployed on Ethereum's main chain. We achieve this coverage by treating bytecodes as equivalent if they share the same skeleton, where the skeleton of a bytecode is obtained by omitting functionally irrelevant parts. This reduces the 48 million contracts deployed on Ethereum up to January 2022 to 248,328 contracts with distinct skeletons. For bulk execution, we use the open-source framework SmartBugs, which facilitates the analysis of Solidity smart contracts, and extend it to also accept bytecode as the only input. Moreover, we integrate six further tools for bytecode analysis. Running the 12 tools included in our study on the dataset took 30 CPU years. While the tools report a total of 1,307,486 potential weaknesses, we observe a decrease in reported weaknesses over time, as well as a degradation of the tools to varying degrees.
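The skeleton-based deduplication can be illustrated with a minimal Python sketch. The abstract does not say which parts of the bytecode count as functionally irrelevant, so the rule used here (replacing the operands of PUSH instructions with zeros) is an assumption chosen for illustration only, as are the function names `skeleton` and `group_by_skeleton` and the toy addresses in the usage note. It is not the procedure used in the study.

```python
from __future__ import annotations

import hashlib
from collections import defaultdict

# EVM opcodes PUSH1..PUSH32 are followed by 1..32 immediate operand bytes.
PUSH1, PUSH32 = 0x60, 0x7F


def skeleton(code: bytes) -> bytes:
    """Return the bytecode with every PUSH operand replaced by zero bytes.

    Assumption for illustration: PUSH operands are the only parts treated
    as functionally irrelevant. Data embedded after the code is not handled.
    """
    out = bytearray()
    i = 0
    while i < len(code):
        op = code[i]
        out.append(op)
        i += 1
        if PUSH1 <= op <= PUSH32:
            wanted = op - PUSH1 + 1
            # Tolerate a truncated PUSH at the very end of the code.
            operand_len = min(wanted, len(code) - i)
            out.extend(b"\x00" * operand_len)
            i += operand_len
    return bytes(out)


def group_by_skeleton(codes: dict[str, bytes]) -> dict[str, list[str]]:
    """Group contract addresses by the SHA-256 digest of their skeleton."""
    groups: dict[str, list[str]] = defaultdict(list)
    for address, code in codes.items():
        digest = hashlib.sha256(skeleton(code)).hexdigest()
        groups[digest].append(address)
    return dict(groups)
```

As a usage example, `group_by_skeleton({"0xA": bytes.fromhex("6001600201"), "0xB": bytes.fromhex("6005600701"), "0xC": bytes.fromhex("6001600202")})` places the hypothetical contracts `0xA` and `0xB` in one group, because they differ only in pushed constants, and `0xC` in another, because its final opcode differs. Only one representative per group would then need to be analyzed.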

