Most research on hate speech detection has focused on English, where a sizeable amount of labeled training data is available. However, expanding hate speech detection to more languages requires approaches that need only minimal training data. In this paper, we test whether natural language inference (NLI) models, which perform well in zero- and few-shot settings, can improve hate speech detection performance in scenarios where only a limited amount of labeled data is available in the target language. Our evaluation on five languages demonstrates large performance improvements of NLI fine-tuning over direct fine-tuning in the target language. However, the effectiveness of intermediate fine-tuning on English data, as proposed in previous work, is hard to match. Only in settings where the English training data does not match the test domain can our customised NLI formulation outperform intermediate fine-tuning on English. Based on our extensive experiments, we propose a set of recommendations for hate speech detection in languages where minimal labeled training data is available.