With the rise of online abuse, the NLP community has begun investigating the use of neural architectures to generate counterspeech that can "counter" the vicious tone of such abusive speech and dilute/ameliorate its rippling effect over the social network. However, most efforts so far have focused primarily on English. To bridge this gap for low-resource languages such as Bengali and Hindi, we create a benchmark dataset of 5,062 abusive speech/counterspeech pairs, of which 2,460 pairs are in Bengali and 2,602 pairs are in Hindi. To establish an effective benchmark, we implement several baseline models that employ various interlingual transfer mechanisms under different configurations to generate suitable counterspeech. We observe that the monolingual setup yields the best performance. Further, with synthetic transfer, language models can generate counterspeech to some extent; in particular, we notice that transferability is better when the languages belong to the same language family.