This paper focuses on measuring and forecasting incivility in conversations that follow replies to hate speech. Identifying replies that steer conversations away from hatred and elicit civil follow-up conversations sheds light on effective strategies to engage with hate speech and proactively avoid further escalation. We propose new metrics that take into account various dimensions of antisocial and prosocial behaviors to measure conversation incivility following replies to hate speech. Our best metric aligns with human perceptions better than metrics from prior work. Additionally, we present analyses of a) the language of antisocial and prosocial posts, b) the relationship between antisocial or prosocial posts and user interactions, and c) the language of replies to hate speech that elicit follow-up conversations with different incivility levels. We show that forecasting the incivility level of conversations following a reply to hate speech is a challenging task. We also present qualitative analyses to identify the most common errors made by our best model.