Hate speech on social media threatens the mental and physical well-being of individuals and contributes to real-world violence. Reshares are an important driver of the spread of hate speech, and thus of why hateful posts can go viral, yet little is known about why users reshare hate speech. In this paper, we present a comprehensive causal analysis of the user attributes that make users reshare hate speech. Causal inference from observational social media data is challenging, however, because such data likely suffer from selection bias and are further confounded by differences in users' vulnerability to hate speech. We develop a novel, three-step causal framework: (1) We debias the observational social media data by applying inverse propensity scoring. (2) We use the debiased propensity scores to model users' latent vulnerability to hate speech as a latent embedding. (3) We model the causal effects of user attributes on users' probability of sharing hate speech, while controlling for their latent vulnerability to hate speech. Compared to existing baselines, a particular strength of our framework is that it models causal effects that are non-linear, yet still explainable. We find that users with fewer followers, fewer friends, and fewer posts share more hate speech. Younger accounts, in contrast, share less hate speech. Overall, understanding the factors that drive users to share hate speech is crucial for detecting individuals at risk of engaging in harmful behavior and for designing effective mitigation strategies.
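To make the shape of the three-step framework more concrete, the following is a minimal sketch rather than the implementation used in the paper. It assumes that propensity scores are already available, treats the latent vulnerability from step (2) as a given scalar per observation instead of a learned embedding, and swaps in a plain weighted logistic regression for step (3), which is linear and therefore does not capture the non-linear effects the framework models. All function names, the `log_followers` attribute, and the synthetic data are hypothetical and serve only to exercise the code.

```python
import math
import random


def ips_weights(propensities, clip=0.1):
    """Step 1 (sketch): inverse propensity weights, clipped for stability."""
    return [1.0 / max(p, clip) for p in propensities]


def fit_weighted_logistic(X, y, w, lr=0.5, epochs=300):
    """Step 3 stand-in: IPS-weighted logistic regression via gradient descent.

    Returns [intercept, coef_1, ..., coef_k]. A linear model is an
    assumption made for brevity, not the model described in the abstract.
    """
    beta = [0.0] * (len(X[0]) + 1)
    total_w = sum(w)
    for _ in range(epochs):
        grad = [0.0] * len(beta)
        for xi, yi, wi in zip(X, y, w):
            z = beta[0] + sum(b * x for b, x in zip(beta[1:], xi))
            p = 1.0 / (1.0 + math.exp(-z))
            err = wi * (p - yi)  # weighted residual
            grad[0] += err
            for j, x in enumerate(xi):
                grad[j + 1] += err * x
        beta = [b - lr * g / total_w for b, g in zip(beta, grad)]
    return beta


def demo(seed=0, n=400):
    """Fit the sketch on synthetic data (not real results)."""
    rng = random.Random(seed)
    X, y, props = [], [], []
    for _ in range(n):
        log_followers = rng.gauss(0.0, 1.0)  # hypothetical standardized attribute
        vulnerability = rng.gauss(0.0, 1.0)  # placeholder for the step 2 output
        # Synthetic ground truth chosen only for illustration.
        z = -1.0 - 1.5 * log_followers + 1.5 * vulnerability
        p_share = 1.0 / (1.0 + math.exp(-z))
        y.append(1.0 if rng.random() < p_share else 0.0)
        X.append([log_followers, vulnerability])
        props.append(rng.uniform(0.2, 0.9))  # assumed known propensities
    return fit_weighted_logistic(X, y, ips_weights(props))


if __name__ == "__main__":
    intercept, b_followers, b_vulnerability = demo()
    print(f"followers: {b_followers:.2f}, vulnerability: {b_vulnerability:.2f}")
```

Including the vulnerability column alongside the attribute is what "controlling for" means in this sketch: the coefficient on `log_followers` is estimated with vulnerability held fixed. Clipping the propensities bounds the largest weight, trading a small bias for lower variance.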