Generating adversarial examples contributes to the robustness of mainstream neural machine translation~(NMT) systems. However, popular adversarial policies are designed for a fixed tokenization, which hinders their efficacy for common character perturbations that induce varied tokenizations. Building on existing adversarial generation via reinforcement learning~(RL), we propose the `DexChar policy', which introduces character perturbations into the mainstream adversarial policy based on token substitution. Furthermore, we improve the self-supervised matching that provides feedback in RL so that it satisfies the semantic constraints required when training adversaries. Experiments show that our method handles scenarios where baseline adversaries fail, and generates highly effective adversarial examples for analyzing and optimizing the system.
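To illustrate the kind of character perturbation the abstract refers to, the sketch below applies one random character-level edit (swap, delete, insert, or substitute) to a token. This is a minimal illustration only: the function name and the uniform random edit selection are assumptions for demonstration, whereas the paper's DexChar policy chooses perturbations via an RL agent under semantic constraints.

```python
import random
import string

def char_perturb(token, rng=None):
    """Apply one random character-level edit to a token.

    Illustrative sketch of generic character perturbations (swap,
    delete, insert, substitute); NOT the paper's DexChar policy,
    which selects edits with a learned RL policy.
    """
    rng = rng or random.Random()
    if len(token) < 2:
        return token
    op = rng.choice(["swap", "delete", "insert", "substitute"])
    i = rng.randrange(len(token) - 1)
    chars = list(token)
    if op == "swap":
        # Transpose two adjacent characters.
        chars[i], chars[i + 1] = chars[i + 1], chars[i]
    elif op == "delete":
        del chars[i]
    elif op == "insert":
        chars.insert(i, rng.choice(string.ascii_lowercase))
    else:  # substitute
        chars[i] = rng.choice(string.ascii_lowercase)
    return "".join(chars)
```

Note that even a single such edit can change how a subword tokenizer segments the word, which is exactly the "varied tokenization" setting where fixed-tokenization substitution policies break down.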