Text detoxification is the task of transferring the style of text from toxic to neutral. While there are approaches yielding promising results in the monolingual setup, e.g., (Dale et al., 2021; Hallinan et al., 2022), cross-lingual transfer for this task remains a challenging open problem (Moskovskiy et al., 2022). In this work, we present a large-scale study of strategies for cross-lingual text detoxification: given a parallel detoxification corpus for one language, the goal is to transfer detoxification ability to another language for which we do not have such a corpus. Moreover, we are the first to explore a new task where text translation and detoxification are performed simultaneously, providing several strong baselines for this task. Finally, we introduce new automatic detoxification evaluation metrics with higher correlations with human judgments than previous benchmarks. We also assess the most promising approaches with manual markup, determining the best strategy for transferring the knowledge of text detoxification between languages.