Counter narratives (CNs) are a promising approach to combating online hate speech (HS) without infringing on freedom of speech. In recent years, interest has grown in generating CNs automatically with natural language generation techniques. However, current automatic CN generation methods rely mainly on expert-authored datasets for training, which are time-consuming and labor-intensive to acquire. Furthermore, these methods cannot directly obtain and extend counter-knowledge from external statistics, facts, or examples. To address these limitations, we propose Retrieval-Augmented Unsupervised Counter Narrative Generation (RAUCG), which automatically expands external counter-knowledge and maps it into CNs in an unsupervised paradigm. Specifically, we first introduce an SSF retrieval method that retrieves counter-knowledge from three perspectives: stance consistency, semantic overlap rate, and fitness for HS. We then design an energy-based decoding mechanism that quantizes knowledge injection, countering, and fluency constraints into differentiable functions, enabling the model to build mappings from counter-knowledge to CNs without expert-authored CN data. Lastly, we comprehensively evaluate model performance in terms of language quality, toxicity, persuasiveness, relevance, and success rate of countering HS. Experimental results show that RAUCG outperforms strong baselines on all metrics and exhibits stronger generalization capabilities, achieving significant improvements of +2.0% in relevance and +4.5% in success rate of countering. Moreover, RAUCG enables GPT2 to outperform T0 on all metrics, even though T0 is approximately eight times larger. Warning: This paper may contain offensive or upsetting content!
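To make the three retrieval perspectives more concrete, the following is a minimal sketch of a staged filter over candidate passages. It is not the paper's SSF implementation: the `Candidate` fields, the `overlap_rate` and `retrieve` helpers, the thresholds, and the example passages are all hypothetical. In particular, the stance and fitness scores are assumed to come from external models, and a crude lexical token overlap stands in for the semantic overlap rate.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    text: str
    stance: float   # assumed: probability the passage opposes the HS (from some stance model)
    fitness: float  # assumed: how well the passage fits as a reply (e.g. a language-model score)


def overlap_rate(query: str, passage: str) -> float:
    """Share of query tokens that also occur in the passage (lexical stand-in)."""
    q = set(query.lower().split())
    p = set(passage.lower().split())
    return len(q & p) / len(q) if q else 0.0


def retrieve(query, candidates, stance_min=0.5, overlap_min=0.2, top_k=1):
    """Filter by stance, then by overlap, then rank the survivors by fitness."""
    kept = [c for c in candidates if c.stance >= stance_min]
    kept = [c for c in kept if overlap_rate(query, c.text) >= overlap_min]
    kept.sort(key=lambda c: c.fitness, reverse=True)
    return kept[:top_k]


# Hypothetical example data, for illustration only.
query = "immigrants take all the jobs"
candidates = [
    Candidate("Studies show immigrants create jobs and start new businesses", 0.9, 0.8),
    Candidate("immigrants take jobs from locals everywhere", 0.1, 0.9),
    Candidate("Rainfall was heavy this week", 0.7, 0.6),
    Candidate("Immigrants pay taxes that fund public services and jobs", 0.8, 0.5),
]

best = retrieve(query, candidates)
print(best[0].text)
```

In this sketch, the second passage is dropped because its stance agrees with the hate speech, the third because it shares no tokens with the query, and the remaining two are ordered by fitness. A real system would replace each placeholder score with a learned model.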