This paper presents TeleHunt, a framework and tool for evaluating how effectively different strategies discover cybercriminal communities on Telegram. TeleHunt employs a set of reference-driven snowballing strategies that integrate message-level classification, contextual filtering, and market-segment labeling. Starting from open- and dark-web seeds, we systematically evaluate how seed source, pointer type, and exploration strategy influence discovery outcomes along three dimensions: efficiency, accessibility, and rediscovery. Our work contributes (i) a modular pipeline for cybercrime content discovery, (ii) the first systematic comparison of Telegram discovery strategies, together with an empirical characterization of market-segment accessibility, and (iii) a labeled dataset of over 172 million messages from 6,022 Telegram communities.
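To illustrate what reference-driven snowballing can look like in practice, the following minimal Python sketch expands a seed set by following Telegram pointers (t.me links and @mentions) found in messages that pass a relevance filter. It is not TeleHunt's implementation: the function names (`extract_pointers`, `snowball`), the pointer regex, the keyword filter standing in for message-level classification, the way rediscoveries are counted, and the in-memory toy corpus are all assumptions made for this example.

```python
import re
from collections import deque

# Assumed pointer syntax: t.me links and @mentions with 5-32 character handles.
POINTER_RE = re.compile(
    r"(?:https?://)?t\.me/([A-Za-z0-9_]{5,32})|@([A-Za-z0-9_]{5,32})"
)


def extract_pointers(text):
    """Return the community handles referenced in a message, lowercased."""
    return [(link or mention).lower() for link, mention in POINTER_RE.findall(text)]


def snowball(seeds, fetch_messages, is_relevant, max_communities=100):
    """Breadth-first snowballing from a seed set (illustrative only).

    fetch_messages(handle) returns the messages of a community, or an empty
    list if it is inaccessible; is_relevant(message) is a stand-in for a
    message-level classifier. Returns the visited handles and the number of
    pointers that led back to an already visited community.
    """
    queue = deque(s.lower() for s in seeds)
    visited = set()
    rediscoveries = 0
    while queue and len(visited) < max_communities:
        community = queue.popleft()
        if community in visited:
            continue
        visited.add(community)
        for message in fetch_messages(community):
            if not is_relevant(message):
                continue  # pointers in irrelevant messages are not followed
            for target in extract_pointers(message):
                if target in visited:
                    rediscoveries += 1
                else:
                    queue.append(target)
    return visited, rediscoveries


# Hypothetical toy data; a real crawler would query Telegram instead.
TOY_CORPUS = {
    "seed_market": [
        "fresh logs for sale, join https://t.me/logs_shop",
        "weather is nice @random_chat",
    ],
    "logs_shop": [
        "selling stealer logs, mirror at @seed_market",
        "new dumps at t.me/cards_hub",
    ],
    "cards_hub": [],
}


def fetch_toy(community):
    return TOY_CORPUS.get(community, [])


def keyword_filter(message):
    # Crude keyword match used only as a placeholder for a trained classifier.
    return any(word in message for word in ("logs", "dumps", "sale"))


if __name__ == "__main__":
    found, again = snowball(["seed_market"], fetch_toy, keyword_filter)
    print(sorted(found), again)
```

In this toy run the pointer to `random_chat` is never followed because the message that carries it fails the relevance filter, which mirrors the role of contextual filtering in keeping exploration focused. Swapping the queue discipline or the filter is one way to compare exploration strategies in such a sketch.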