Grammatical error correction (GEC) is an important NLP task that is currently most often solved with autoregressive sequence-to-sequence models. However, approaches of this class are inherently slow because they generate tokens one at a time, so non-autoregressive alternatives are needed. In this work, we propose a novel non-autoregressive approach to GEC that decouples the architecture into two parts: a permutation network, which outputs a self-attention weight matrix that can be used in beam search to find the best permutation of input tokens (with auxiliary {ins} tokens), and a decoder network based on a step-unrolled denoising autoencoder, which fills in specific tokens. This allows us to find the token permutation after only one forward pass of the permutation network, avoiding autoregressive constructions. We show that the resulting network improves over previously known non-autoregressive methods for GEC and reaches the level of autoregressive methods that do not use language-specific synthetic data generation methods. Our results are supported by a comprehensive experimental validation on the CoNLL-2014 and Write&Improve+LOCNESS datasets and by an extensive ablation study that supports our architectural and algorithmic choices.
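To make the permutation step more concrete, the following is a minimal sketch of how beam search could turn a single weight matrix into a token ordering. It is an illustration under stated assumptions, not the method described in this work: the abstract does not specify how the matrix is interpreted, so we assume here that entry `weights[i][j]` scores placing input position `j` immediately after position `i`, and that position 0 is a fixed start token. The function name `beam_search_permutation`, the toy matrix, and the beam size are all hypothetical. The decoder that would fill in {ins} positions is not modeled.

```python
import math


def beam_search_permutation(log_scores, beam_size=4):
    """Return (order, score) for the best ordering found by beam search.

    Assumption for illustration: log_scores[i][j] is the log-score of placing
    position j right after position i, and position 0 is a fixed start token.
    """
    n = len(log_scores)
    beams = [([0], 0.0)]  # each hypothesis: (positions chosen so far, total log-score)
    for _ in range(n - 1):
        candidates = []
        for order, score in beams:
            used = set(order)
            last = order[-1]
            for j in range(n):
                if j not in used:  # every position is used exactly once
                    candidates.append((order + [j], score + log_scores[last][j]))
        # Keep only the highest-scoring partial orderings.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_size]
    return beams[0]


# Toy input (hypothetical): a scrambled sentence and hand-picked weights.
tokens = ["<s>", "goes", "she", "home"]
weights = [
    [0.0, 0.1, 0.8, 0.1],  # what follows "<s>"
    [0.0, 0.0, 0.1, 0.9],  # what follows "goes"
    [0.0, 0.8, 0.0, 0.2],  # what follows "she"
    [0.0, 0.5, 0.5, 0.0],  # what follows "home"
]
log_scores = [[math.log(w) if w > 0 else -math.inf for w in row] for row in weights]

order, score = beam_search_permutation(log_scores, beam_size=2)
print([tokens[i] for i in order])
```

The matrix is computed once and the search only reads from it, which is the sense in which no further forward passes of a network are needed while the ordering is being built. In this sketch the search itself still extends hypotheses one position at a time, but each step is a table lookup rather than a model call.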