The use of abusive language online has become an increasingly pervasive problem that damages both individuals and society, with effects ranging from psychological harm to escalation into real-life violence and even death. Machine learning models have been developed to detect abusive language automatically, but these models can suffer from temporal bias, which arises when topics, language use, or social norms change over time. This study investigates the nature and impact of temporal bias in abusive language detection across several languages and explores mitigation methods. We evaluate model performance on abusive language data sets from different time periods. Our results show that temporal bias is a major challenge for abusive language detection: models trained on historical data show a significant drop in performance over time. We also present an extensive diachronic linguistic analysis of these data sets to explore how the language evolves and why performance declines. This study sheds light on the pervasive issue of temporal bias in abusive language detection across languages, offering insights into language evolution and temporal bias mitigation.