BitTorrent remains a prominent channel for illicit distribution of copyrighted material, yet the supply side of such content is understudied. We introduce MagnetDB, a longitudinal dataset of torrents discovered through the BitTorrent DHT between 2018 and 2024, containing more than 28.6 million torrents and metadata for more than 950 million files. While our primary focus is on enabling research on the supply of pirated movies and TV shows, the dataset also encompasses other legitimate and illegitimate torrents. By applying IMDb matching and annotation to movie and TV show torrents, MagnetDB facilitates detailed analyses of how pirated content evolves in the BitTorrent network. Researchers can use MagnetDB to examine distribution trends, subcultural practices, and the gift economy within piracy ecosystems. Through its scale and temporal scope, MagnetDB presents a unique opportunity for investigating the broader dynamics of BitTorrent and advancing empirical knowledge of digital piracy.