Breadcrumbs in the Digital Forest: Tracing Criminals through Torrent Metadata with OSINT

This work investigates the potential of torrent metadata as a source of open-source intelligence (OSINT), with a focus on user profiling and behavioral analysis. While peer-to-peer (P2P) networks such as BitTorrent are well studied with respect to privacy and performance, their metadata is rarely used for investigative purposes. The work presents a proof of concept demonstrating how tracker responses, torrent index data, and enriched IP metadata can reveal patterns associated with high-risk behavior. The research follows a five-step OSINT process: source identification, data collection, enrichment, behavioral analysis, and presentation of the results. Data were collected from The Pirate Bay and UDP trackers, yielding a dataset of more than 60,000 unique IP addresses across 206 popular torrents. The data were enriched with geolocation, anonymization status, and flags indicating involvement in child exploitation material (CEM). A case study on sensitive e-books shows how such data can help detect possible interest in illicit content. Network analysis highlights peer clustering, co-download patterns, and the use of privacy tools by suspicious users. The study shows that publicly available torrent metadata can support scalable and automated OSINT profiling. It contributes to digital forensics by proposing a new method for extracting useful signals from noisy data, with applications in law enforcement, cybersecurity, and threat analysis.
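The co-download patterns mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' method; it only assumes that collection yields (torrent, peer IP) observations, and it counts how many torrents each pair of peers has in common. The function name `co_download_pairs`, the `min_shared` threshold, the placeholder hashes, and the sample addresses (taken from the reserved documentation ranges) are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def co_download_pairs(observations, min_shared=2):
    """Map each peer pair to the number of torrents on which both were observed.

    `observations` is an iterable of (infohash, ip) tuples. Only pairs that
    share at least `min_shared` torrents are returned.
    """
    # Group the torrents seen for each peer.
    torrents_by_peer = defaultdict(set)
    for infohash, ip in observations:
        torrents_by_peer[ip].add(infohash)

    # Compare every pair of peers and keep those with enough overlap.
    pairs = {}
    for a, b in combinations(sorted(torrents_by_peer), 2):
        shared = len(torrents_by_peer[a] & torrents_by_peer[b])
        if shared >= min_shared:
            pairs[(a, b)] = shared
    return pairs


# Hypothetical sample data (documentation-only IP ranges, placeholder hashes).
observations = [
    ("hash_a", "192.0.2.1"),
    ("hash_b", "192.0.2.1"),
    ("hash_a", "198.51.100.7"),
    ("hash_b", "198.51.100.7"),
    ("hash_a", "203.0.113.9"),
]

print(co_download_pairs(observations))
```

The pairwise comparison is quadratic in the number of peers, so a dataset on the scale reported here would likely need an inverted index from torrent to peers or a graph library rather than this direct loop.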


