We propose ReSeTOX (REdo SEarch if TOXic), a method that addresses the problem of Neural Machine Translation (NMT) systems producing translations that contain toxic words not present in the input. The goal is to mitigate this added toxicity without re-training. When added toxicity is detected during inference, ReSeTOX dynamically adjusts the key-value self-attention weights and re-evaluates the beam search hypotheses. Experimental results show that ReSeTOX reduces added toxicity by 57% while maintaining an average translation quality of 99.5% across 164 languages.
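The detect-adjust-redo loop described above can be illustrated with a toy sketch. This is not the ReSeTOX implementation: the word-list detector, the `ToyDecoder` class, its scalar `penalty` (a stand-in for the adjusted attention state), and the sentence-level retry loop are all hypothetical simplifications chosen so that the example runs without a real NMT model.

```python
from typing import Callable, List, Set


def added_toxic_words(source: str, hypothesis: str,
                      src_toxic: Set[str], tgt_toxic: Set[str]) -> List[str]:
    """Return toxic words in the hypothesis when the source contains none.

    Simplifying assumption: toxicity is detected by word-list lookup, and
    it counts as "added" only if the source has no toxic word at all.
    """
    if any(tok in src_toxic for tok in source.lower().split()):
        return []  # toxicity was already in the input, so it is not "added"
    return [tok for tok in hypothesis.lower().split() if tok in tgt_toxic]


def redo_search_if_toxic(source: str,
                         decode: Callable[[str], str],
                         adjust: Callable[[], None],
                         src_toxic: Set[str], tgt_toxic: Set[str],
                         max_retries: int = 3) -> str:
    """Decode, and redo the search after an adjustment while toxicity is added."""
    hypothesis = decode(source)
    for _ in range(max_retries):
        if not added_toxic_words(source, hypothesis, src_toxic, tgt_toxic):
            break  # no added toxicity: keep the current hypothesis
        adjust()                     # hypothetical stand-in for the attention update
        hypothesis = decode(source)  # re-evaluate the candidates
    return hypothesis


class ToyDecoder:
    """Hypothetical stand-in for an NMT model with two fixed candidates."""

    def __init__(self, tgt_toxic: Set[str]) -> None:
        self.tgt_toxic = tgt_toxic
        self.penalty = 0.0  # plays the role of the adjusted model state
        self.candidates = {"you silly fool": 0.9, "you silly person": 0.8}

    def decode(self, source: str) -> str:
        # Pick the candidate with the best score after the toxicity penalty.
        def score(text: str) -> float:
            n_toxic = sum(tok in self.tgt_toxic for tok in text.split())
            return self.candidates[text] - self.penalty * n_toxic
        return max(self.candidates, key=score)

    def adjust(self) -> None:
        self.penalty += 0.2  # each adjustment makes toxic candidates less likely


tgt_toxic = {"fool"}
model = ToyDecoder(tgt_toxic)
result = redo_search_if_toxic("eres gracioso", model.decode, model.adjust,
                              src_toxic=set(), tgt_toxic=tgt_toxic)
print(result)
```

In this toy setting the first search returns the toxic candidate, one adjustment lowers its score, and the repeated search selects the non-toxic alternative. The model parameters are never re-trained; only the inference-time state changes, which is the property the abstract emphasizes.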