Simultaneous machine translation (SiMT) models are trained to strike a balance between latency and translation quality. However, training these models to achieve high quality while maintaining low latency often leads to a tendency toward aggressive anticipation. We argue that this issue stems from the autoregressive architecture upon which most existing SiMT models are built. To address it, we propose the non-autoregressive streaming Transformer (NAST), which comprises a unidirectional encoder and a non-autoregressive decoder with intra-chunk parallelism. We enable NAST to generate blank tokens or repeated tokens so that it can adjust its READ/WRITE strategy flexibly, and we train it to maximize the non-monotonic latent alignment with an alignment-based latency loss. Experiments on various SiMT benchmarks demonstrate that NAST outperforms previous strong autoregressive SiMT baselines.
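To illustrate how blank and repeated tokens can act as a READ/WRITE policy, the following is a minimal sketch, not the authors' implementation. It assumes a CTC-style collapsing rule (merge consecutive repeats, then drop blanks), which the abstract does not spell out, and that the decoder emits a fixed number of tokens per source chunk. The names `BLANK`, `collapse`, and `writes_per_chunk` are hypothetical.

```python
from itertools import groupby

BLANK = "<blank>"  # hypothetical blank symbol, chosen for illustration


def collapse(tokens):
    """Assumed CTC-style collapse: merge consecutive repeats, then drop blanks."""
    deduped = [tok for tok, _ in groupby(tokens)]
    return [tok for tok in deduped if tok != BLANK]


def writes_per_chunk(chunks):
    """Return the target tokens newly written after each decoded chunk.

    `chunks` is a list of raw decoder outputs, one list per source chunk.
    A chunk that adds nothing after collapsing behaves like a pure READ step.
    """
    raw, written, out = [], 0, []
    for chunk in chunks:
        raw.extend(chunk)
        collapsed = collapse(raw)
        # The collapse of a prefix is a prefix of the full collapse,
        # so the new tokens are exactly the unseen suffix.
        out.append(collapsed[written:])
        written = len(collapsed)
    return out


chunks = [
    [BLANK, BLANK],   # nothing written: wait for more source
    ["I", "I"],       # repeats merge into a single token
    ["I", "like"],    # leading repeat merges with the previous chunk
    [BLANK, "tea"],
]
print(writes_per_chunk(chunks))  # [[], ['I'], ['like'], ['tea']]
```

Under these assumptions, a chunk filled with blanks or with repeats of the previous token defers output, while a chunk with several distinct tokens writes more than one target word at once, so the same output space covers both conservative and eager behavior.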