The meme coin ecosystem has grown into one of the most active yet least observable segments of the cryptocurrency market, characterized by extreme churn, minimal project commitment, and widespread fraudulent behavior. While countless meme coins are deployed across multiple blockchains, they rely heavily on off-chain web and social infrastructure to signal legitimacy. These very signals are largely absent from existing datasets, which are often limited to single-chain data or lack the multimodal artifacts required for comprehensive risk modeling. To address this gap, we introduce MemeChain, a large-scale, open-source, cross-chain dataset comprising 34,988 meme coins across Ethereum, BNB Smart Chain, Solana, and Base. MemeChain integrates on-chain data with off-chain artifacts, including website HTML source code, token logos, and linked social media accounts, enabling multimodal and forensic study of meme coin projects. Analysis of the dataset shows that visual branding is frequently omitted in low-effort deployments, and many projects lack a functional website. Moreover, we quantify the ecosystem's extreme volatility, identifying 1,801 tokens (5.15%) that cease all trading activity within just 24 hours of launch. By providing unified cross-chain coverage and rich off-chain context, MemeChain serves as a foundational resource for research in financial forensics, multimodal anomaly detection, and automated scam prevention in the meme coin ecosystem.