Launchpads have become the dominant mechanism for issuing memecoins on blockchains because of their fully automated, no-code creation process. This new issuance paradigm has led to a surge in high-risk token launches, causing substantial financial losses for unsuspecting buyers. In this paper, we introduce MemeTrans, the first dataset for studying and detecting high-risk memecoin launches on Solana. MemeTrans covers over 40k memecoin launches that successfully migrated to a public decentralized exchange (DEX), with over 30 million transactions during the initial sale on the launchpad and 180 million transactions after migration. To precisely capture launch patterns, we design 122 features spanning dimensions such as context, trading activity, holding concentration, and time-series dynamics, supplemented with bundle-level data that reveals multiple accounts controlled by the same entity. Finally, we introduce an annotation approach that labels the risk level of memecoin launches by combining statistical indicators with a manipulation-pattern detector. Experiments on the introduced high-risk launch detection task suggest that the designed features are informative for capturing high-risk patterns and that ML models trained on MemeTrans can reduce financial loss by 56.1%. Our dataset, experimental code, and pipeline are publicly available at: https://github.com/git-disl/MemeTrans.
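To make the notions of holding concentration and bundle-level data concrete, the following is a minimal sketch of how such features might be computed for a single launch. It is not the MemeTrans pipeline: the function names `concentration_features` and `merge_bundles`, the choice of top-k share and the Herfindahl-Hirschman index, and the input layout (a mapping from account to token balance, plus a mapping from account to bundle identifier) are all assumptions made for illustration.

```python
from collections.abc import Mapping


def merge_bundles(
    balances: Mapping[str, float], bundle_of: Mapping[str, str]
) -> dict[str, float]:
    """Sum balances of accounts assumed to belong to the same entity.

    `bundle_of` maps an account to a hypothetical bundle id; accounts
    without an entry are treated as their own entity.
    """
    merged: dict[str, float] = {}
    for account, balance in balances.items():
        entity = bundle_of.get(account, account)
        merged[entity] = merged.get(entity, 0.0) + balance
    return merged


def concentration_features(
    balances: Mapping[str, float], k: int = 10
) -> dict[str, float]:
    """Illustrative concentration features (not the paper's definitions)."""
    holdings = sorted((b for b in balances.values() if b > 0), reverse=True)
    total = sum(holdings)
    if total == 0:
        return {"top_k_share": 0.0, "hhi": 0.0}
    shares = [b / total for b in holdings]
    return {
        # Fraction of supply held by the k largest holders.
        "top_k_share": sum(shares[:k]),
        # Herfindahl-Hirschman index: sum of squared holder shares.
        "hhi": sum(s * s for s in shares),
    }


if __name__ == "__main__":
    balances = {"a": 50.0, "b": 30.0, "c": 20.0}
    bundle_of = {"a": "entity_1", "b": "entity_1"}  # hypothetical bundle labels
    print(concentration_features(balances, k=1))
    print(concentration_features(merge_bundles(balances, bundle_of), k=1))
```

Under these assumptions, merging bundled accounts before computing the features raises the measured concentration, which illustrates why account-level statistics alone can understate how much supply a single entity controls.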