We present the Multi-Modal Discussion Transformer (mDT), a novel method for detecting hate speech in online social networks such as Reddit discussions. In contrast to traditional comment-only methods, our approach to labelling a comment as hate speech involves a holistic analysis of text and images grounded in the discussion context. This is done by leveraging graph transformers to capture the contextual relationships in the discussion surrounding a comment and to ground the interwoven fusion layers, which combine text and image embeddings rather than processing the modalities separately. To evaluate our work, we present a new dataset, HatefulDiscussions, comprising complete multi-modal discussions from multiple online communities on Reddit. We compare the performance of our model to baselines that only process individual comments and conduct extensive ablation studies.
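To make the idea of context-grounded, multi-modal comment representations concrete, the following is a minimal toy sketch and not the mDT architecture itself. It assumes each comment already has a text embedding and optionally an image embedding, fuses them by a simple element-wise mean (a stand-in for learned fusion layers), and then applies one round of dot-product attention restricted to a comment and its direct reply-graph neighbours (a simplification of a graph transformer). The function names `fuse` and `graph_attention`, the vectors, and the edge list are all hypothetical.

```python
import math


def fuse(text_vec, image_vec):
    """Toy fusion (assumption): element-wise mean of text and image embeddings.

    Comments without an image fall back to the text embedding alone.
    """
    if image_vec is None:
        return list(text_vec)
    return [(t + i) / 2.0 for t, i in zip(text_vec, image_vec)]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def graph_attention(nodes, edges):
    """One round of scaled dot-product attention over the reply graph.

    Each comment attends only to itself and to comments it is directly
    connected to by a reply edge (edges are treated as undirected).
    """
    n = len(nodes)
    dim = len(nodes[0])
    neighbours = {i: {i} for i in range(n)}
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)

    contextual = []
    for i in range(n):
        idx = sorted(neighbours[i])
        scores = [
            sum(x * y for x, y in zip(nodes[i], nodes[j])) / math.sqrt(dim)
            for j in idx
        ]
        weights = softmax(scores)
        contextual.append(
            [sum(w * nodes[j][d] for w, j in zip(weights, idx)) for d in range(dim)]
        )
    return contextual


# Hypothetical three-comment thread: comment 1 replies to 0, comment 2 replies to 1.
text_embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
image_embeddings = [None, [1.0, 0.0], None]  # only comment 1 carries an image
reply_edges = [(0, 1), (1, 2)]

fused = [fuse(t, i) for t, i in zip(text_embeddings, image_embeddings)]
contextual = graph_attention(fused, reply_edges)
```

In this sketch the representation of comment 0 changes once its reply (comment 1) is taken into account, which is the intuition behind classifying a comment in its discussion context instead of in isolation. A real model would replace both the mean fusion and the neighbour-masked attention with learned layers.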