Grammatical error correction (GEC) systems are usually trained and evaluated on GEC benchmarks, but their performance often drops sharply once the surrounding context is slightly perturbed or extended. This suggests that existing GEC models often fail to capture error patterns across varying contexts. In this paper, we investigate counterfactuals for GEC, where subtle changes to the context can flip the label. We propose CoCoGEC, a counterfactual generation framework that creates copies of training instances in which the error-irrelevant context is altered. The framework generates counterfactuals in two steps: (1) it generates intra- and inter-sentence counterfactuals that preserve the error patterns and syntax of the original instances while altering the word-level and sentence-level contexts; (2) it refines the generated counterfactuals by selecting the instances with flipped labels and a high GEC Mutual Information (MI) coefficient. Extensive experiments show that our method substantially improves the stability of GEC models and outperforms a set of data augmentation baselines. In particular, it achieves absolute F0.5 gains of +9.9, +11.3, and +20.8 points on the perturbed BEA-19*, CoNLL-14*, and TEM-8* datasets, respectively. Our code is released at https://github.com/Quinnok/CoCoGEC
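For readers unfamiliar with the F0.5 metric behind the reported gains, here is a minimal sketch of the standard F-beta formula computed from edit-level counts. The function name `f_beta` and the handling of empty denominators are assumptions made for illustration; this is not taken from the CoCoGEC code, and actual GEC scorers may differ in how they align and count edits.

```python
def f_beta(tp: int, fp: int, fn: int, beta: float = 0.5) -> float:
    """F-beta from edit-level counts; beta < 1 weights precision above recall.

    tp: proposed edits that match a gold edit
    fp: proposed edits with no matching gold edit
    fn: gold edits the system missed
    """
    # Assumption: an empty denominator counts as a perfect score (1.0),
    # i.e. no proposed edits gives precision 1, no gold edits gives recall 1.
    precision = tp / (tp + fp) if (tp + fp) else 1.0
    recall = tp / (tp + fn) if (tp + fn) else 1.0

    denom = beta ** 2 * precision + recall
    if denom == 0:
        return 0.0
    return (1 + beta ** 2) * precision * recall / denom


if __name__ == "__main__":
    # 8 correct edits, 2 spurious edits, 8 missed edits:
    # precision = 0.8, recall = 0.5
    print(f_beta(tp=8, fp=2, fn=8))            # F0.5, precision-weighted
    print(f_beta(tp=8, fp=2, fn=8, beta=1.0))  # F1, for comparison
```

With beta = 0.5, precision counts more heavily than recall, which fits GEC evaluation: proposing a wrong correction is generally considered worse than leaving an error untouched.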