How can citizens moderate hate, toxicity, and extremism in online discourse? We analyze a large corpus of more than 130,000 discussions on German Twitter over four turbulent years marked by the migrant crisis and political upheavals. With the help of human annotators, language models, and machine learning classifiers, we identify different dimensions of discourse. We use a matching approach and longitudinal statistical analyses to discern the effectiveness of different counter speech strategies on the micro level (individual tweet pairs), meso level (discussion trees), and macro level (days) of discourse. We find that expressing simple opinions, not necessarily supported by facts but also free of insults, is associated with the least hate, toxicity, and extremity of speech and speakers in subsequent discussions. Sarcasm also helps in achieving those outcomes, particularly on the meso level in the presence of organized extreme groups. Constructive comments such as providing facts or exposing contradictions can backfire and attract more extremity. Mentioning either outgroups or ingroups is typically related to a deterioration of discourse. A pronounced emotional tone, whether negative (such as anger or fear) or positive (such as enthusiasm and pride), also leads to worse outcomes. Going beyond one-shot analyses on smaller samples of discourse, our findings have implications for the successful management of online commons through collective civic moderation.