As various forms of fraud proliferate on Ethereum, it is imperative to safeguard against these malicious activities and protect susceptible users from being victimized. Existing studies rely solely on graph-based fraud detection approaches, which we argue may not be well suited to Ethereum transactions that are highly repetitive, skew-distributed, and heterogeneous. To address these challenges, we propose BERT4ETH, a universal pre-trained Transformer encoder that serves as an account representation extractor for detecting various fraud behaviors on Ethereum. BERT4ETH leverages the modeling capability of the Transformer to capture the dynamic sequential patterns inherent in Ethereum transactions, and it addresses the challenges of pre-training a BERT model for Ethereum with three practical and effective strategies: repetitiveness reduction, skew alleviation, and heterogeneity modeling. Our empirical evaluation demonstrates that BERT4ETH significantly outperforms state-of-the-art methods on the phishing account detection and de-anonymization tasks. The code for BERT4ETH is available at: https://github.com/git-disl/BERT4ETH.