In this work, we examine the linguistic signature of online racial microaggressions (acts) and how it differs from that of personal narratives in which Black social media users recall experiencing such aggressions (recalls). We manually curate and annotate a corpus of acts and recalls from in-the-wild social media discussions, and verify the labels with Black workshop participants. We apply Natural Language Processing (NLP) and qualitative analysis to this data to classify (RQ1), interpret (RQ2), and characterize (RQ3) the language underlying acts and recalls of racial microaggressions in the context of racism in the U.S. Our findings show that neural language models (LMs) can classify acts and recalls with high accuracy (RQ1), and that the contextual words driving these classifications reveal themes that associate Black people with objects that reify negative stereotypes (RQ2). Furthermore, linguistic signatures that overlap between acts and recalls serve functionally different purposes (RQ3), which has broader implications for current challenges in content moderation systems on social media.