We present the Multi-Modal Discussion Transformer (mDT), a novel method for detecting hate speech in online social networks such as Reddit discussions. In contrast to traditional comment-only methods, our approach labels a comment as hate speech through a holistic analysis of text and images grounded in the discussion context. We do this by using graph transformers to capture the contextual relationships in the discussion surrounding a comment and to ground the interwoven fusion layers, which combine text and image embeddings rather than processing each modality separately. To evaluate our work, we present a new dataset, HatefulDiscussions, comprising complete multi-modal discussions from multiple online communities on Reddit. We compare the performance of our model to baselines that only process individual comments and conduct extensive ablation studies.
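To make the idea of discussion-grounded, multi-modal representations concrete, the following is a minimal sketch and not the mDT implementation. It assumes a discussion is given as a reply tree (a hypothetical `parents` list), that each comment has one text and one image embedding of equal size, and it substitutes a simple element-wise mean for the fusion layers and a single unparameterised attention step for the graph transformer. All function names are illustrative.

```python
import math

def fuse(text_vec, image_vec):
    """Stand-in fusion: element-wise mean of the two modality embeddings.
    This is an assumption for illustration, not the paper's fusion layer."""
    return [(t + i) / 2.0 for t, i in zip(text_vec, image_vec)]

def neighbours(parents):
    """Build one neighbour set per comment from a parent list (-1 marks a root)."""
    adj = [{i} for i in range(len(parents))]  # every comment attends to itself
    for child, parent in enumerate(parents):
        if parent >= 0:
            adj[child].add(parent)
            adj[parent].add(child)
    return adj

def graph_attention(nodes, adj):
    """One dot-product attention step restricted to reply-tree neighbours."""
    dim = len(nodes[0])
    out = []
    for i, query in enumerate(nodes):
        idx = sorted(adj[i])
        scores = [
            sum(q * k for q, k in zip(query, nodes[j])) / math.sqrt(dim)
            for j in idx
        ]
        peak = max(scores)  # subtract the maximum for a numerically stable softmax
        weights = [math.exp(s - peak) for s in scores]
        total = sum(weights)
        out.append([
            sum(w / total * nodes[j][d] for w, j in zip(weights, idx))
            for d in range(dim)
        ])
    return out

# Toy discussion: comment 0 is the root, 1 and 2 reply to 0, 3 replies to 1.
parents = [-1, 0, 0, 1]
text = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]
image = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.5, 0.5]]

fused = [fuse(t, i) for t, i in zip(text, image)]
adj = neighbours(parents)
contextual = graph_attention(fused, adj)
```

Each vector in `contextual` is a weighted mix of a comment's fused embedding and those of its parent and replies, so a downstream classifier would see the comment in its discussion context rather than in isolation. A real model would add learned projections, multiple layers, and a trained fusion mechanism.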