We present the Multi-Modal Discussion Transformer (mDT), a novel multi-modal graph-based transformer model for detecting hate speech in online social networks, such as Reddit discussions. In contrast to traditional comment-only methods, our approach labels a comment as hate speech through a holistic analysis of its text and images, grounded in the discussion context. To do so, mDT uses graph transformers to capture the contextual relationships across the entire discussion surrounding a comment and to ground, in that context, the interwoven fusion layers that combine each comment's text and image embeddings rather than processing the modalities separately. We compare the performance of our model to baselines that process only individual comments and conduct extensive ablation studies. To evaluate our work, we present a new dataset, HatefulDiscussions, comprising complete multi-modal discussions from multiple online communities on Reddit. We conclude with future work for multi-modal solutions to deliver social value in online contexts, arguing that capturing a holistic view of a conversation significantly advances the effort to detect anti-social behaviour.