Social media platforms are often blamed for exacerbating political polarization and worsening public dialogue. Many claim that hyperpartisan users post pernicious content, slanted toward their political views, that incites contentious and toxic conversations. However, which factors are actually associated with increased online toxicity and negative interactions? In this work, we explore the role that partisanship and affective polarization play in contributing to toxicity on Twitter/X, at both the individual-user and topic levels. To do this, we train and open-source a DeBERTa-based toxicity detector with a contrastive objective that outperforms the Google Jigsaw Perspective toxicity detector on the Civil Comments test dataset. Then, after collecting 89.6 million tweets from 43,151 Twitter/X users, we determine how several account-level characteristics, including partisanship along the US left-right political spectrum and account age, predict how often users post toxic content. Fitting a Generalized Additive Model (GAM) to our data, we find that the diversity of views and the toxicity of the other accounts with which a user engages have a more marked effect on that user's own toxicity than these account-level characteristics: users who engage with a wider array of political views tend to post more toxic content. Performing topic analysis on the toxic content posted by these accounts, using the large language model MPNet and a version of the DP-Means clustering algorithm, we find similar behavior across 5,288 individual topics: users become more toxic as they engage with a wider diversity of politically charged topics.
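The topic-analysis step pairs MPNet embeddings with DP-Means clustering, which behaves like k-means but opens a new cluster whenever a point lies farther than a penalty threshold λ from every existing centroid, so the number of topics is discovered rather than fixed in advance. A minimal NumPy sketch of the standard DP-Means formulation follows; the toy data, the λ value, and the batch update schedule are illustrative assumptions, not the paper's actual settings or implementation:

```python
import numpy as np

def dp_means(X, lam, max_iter=100):
    """DP-Means sketch: k-means-style updates, but any point whose squared
    distance to every centroid exceeds lam seeds a new cluster."""
    centroids = [X.mean(axis=0)]          # start with one global cluster
    labels = np.zeros(len(X), dtype=int)
    for _ in range(max_iter):
        changed = False
        for i, x in enumerate(X):
            d2 = np.array([((x - c) ** 2).sum() for c in centroids])
            j = int(d2.argmin())
            if d2[j] > lam:               # too far from all clusters: open a new one
                centroids.append(x.copy())
                j = len(centroids) - 1
            if labels[i] != j:
                labels[i], changed = j, True
        # recompute centroids, dropping clusters that lost all their points
        keep = [k for k in range(len(centroids)) if (labels == k).any()]
        remap = {k: i for i, k in enumerate(keep)}
        centroids = [X[labels == k].mean(axis=0) for k in keep]
        labels = np.array([remap[l] for l in labels])
        if not changed:
            break
    return np.array(centroids), labels

# Toy demo (hypothetical data): two well-separated blobs; lam is a
# squared-distance penalty, so lam=4.0 splits points more than 2 apart.
X = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
              [10.0, 10.0], [10.1, 10.0], [10.0, 10.1]])
centroids, labels = dp_means(X, lam=4.0)
```

On real data, `X` would be the matrix of MPNet sentence embeddings of the toxic tweets, and λ would control topic granularity: smaller values yield more, finer-grained topic clusters.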