Non-autoregressive neural machine translation (NAT) models have been proposed to accelerate inference while maintaining relatively high translation quality. However, existing NAT models struggle to achieve the desired efficiency-quality trade-off. On the one hand, fully NAT models offer efficient inference but underperform their autoregressive counterparts. On the other hand, iterative NAT models achieve comparable performance but give up much of the speed advantage. In this paper, we propose RenewNAT, a flexible, efficient, and effective framework that combines the merits of fully and iterative NAT models. RenewNAT first generates potential translation results and then renews them in a single pass. It achieves significant performance improvements at the same cost as traditional NAT models, introducing no additional model parameters or decoding latency. Experimental results on various translation benchmarks (e.g., four WMT benchmarks) show that our framework consistently improves the performance of strong fully NAT methods (e.g., GLAT and DSLP) without additional speed overhead.