Reliable automatic hate speech (HS) detection systems must adapt to the inflow of diverse new data to curtail hate speech. However, such systems commonly generalize poorly to hate speech that differs from their training data, which undermines their robustness in real-world deployments. In this work, we propose a hate speech generalization framework that leverages emotion knowledge in a multitask architecture to improve the generalizability of hate speech detection in a cross-domain setting. We investigate emotion corpora that differ in the scope of their emotion categories to determine which scope best supplies the emotion knowledge needed for generalized hate speech detection. We further assess how using pretrained Transformer models adapted for hate speech affects our emotion-enriched hate speech generalization model. We perform extensive experiments on six publicly available datasets sourced from different online domains and show that our emotion-enriched HS detection generalization method yields consistent improvements in cross-domain evaluation, increasing generalization performance by up to 18.1% and average cross-domain performance by up to 8.5%, as measured by F1.
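The cross-domain evaluation described above can be illustrated with a minimal sketch: a model trained on one dataset is scored on the others, and the F1 scores of all pairs with differing train and test domains are averaged. This is an illustration under stated assumptions, not the evaluation code used in this work. The function names `macro_f1` and `cross_domain_average` are hypothetical, macro averaging is an assumption (the abstract says only "F1 measure"), and the labels and domain names in the usage lines are toy values.

```python
from typing import Dict, Sequence, Tuple


def macro_f1(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Macro-averaged F1 over every label seen in y_true or y_pred.

    Assumption: macro averaging; the source does not state the F1 variant.
    """
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the label never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def cross_domain_average(f1_by_pair: Dict[Tuple[str, str], float]) -> float:
    """Mean F1 over (train_domain, test_domain) pairs whose domains differ."""
    cross = [f1 for (train, test), f1 in f1_by_pair.items() if train != test]
    if not cross:
        raise ValueError("no cross-domain pairs supplied")
    return sum(cross) / len(cross)


# Toy usage: hypothetical domains "a" and "b" with made-up scores.
toy_scores = {("a", "a"): 0.9, ("a", "b"): 0.6, ("b", "a"): 0.5, ("b", "b"): 0.8}
average = cross_domain_average(toy_scores)  # in-domain pairs are excluded
single = macro_f1([1, 0, 1, 0], [1, 0, 0, 0])
```

Excluding the in-domain pairs keeps the average focused on generalization to unseen domains rather than on performance over data resembling the training set.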