We introduce ExtremeBB, a textual database of over 53.5M posts made by 38.5k users on 12 extremist bulletin board forums promoting online hate, harassment, the manosphere and other forms of extremism. It enables large-scale analyses of qualitative and quantitative historical trends going back two decades: measuring hate speech and toxicity; tracing the evolution of different strands of extremist ideology; tracking the relationships between online subcultures, extremist behaviours, and real-world violence; and monitoring extremist communities in near real time. Such analyses can shed light not only on the spread of problematic ideologies but also on the effectiveness of interventions. ExtremeBB comes with a robust ethical data-sharing regime that allows us to share data with academics worldwide. Since 2020, access has been granted to 49 licensees in 16 research groups from 12 institutions.