Grammatical Error Correction (GEC) systems play a vital role in assisting people with their daily writing tasks. However, users may sometimes encounter a GEC system that initially performs well but fails to correct the same errors when the input is slightly modified. To ensure a good user experience, a reliable GEC system should provide consistent and accurate suggestions when encountering irrelevant context perturbations, a property we refer to as context robustness. In this paper, we introduce RobustGEC, a benchmark designed to evaluate the context robustness of GEC systems. RobustGEC comprises 5,000 GEC cases, each with one original error-correct sentence pair and five variants carefully devised by human annotators. Using RobustGEC, we show that state-of-the-art GEC systems still lack sufficient robustness against context perturbations. In addition, we propose a simple yet effective method for mitigating this issue.
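To make the notion of context robustness concrete, the following is a minimal sketch of how a case consisting of one original sentence pair plus its variants might be represented and scored. It is an illustration under stated assumptions, not the evaluation protocol of RobustGEC: the names `GECCase`, `case_is_robust`, `robustness_rate`, and `toy_system` are hypothetical, and exact string match against the reference is a simplifying stand-in for whatever metric the benchmark actually uses.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# (erroneous source sentence, corrected reference sentence)
SentencePair = Tuple[str, str]


@dataclass
class GECCase:
    """Hypothetical container: one original pair plus its perturbed variants."""
    original: SentencePair
    variants: List[SentencePair]

    def all_pairs(self) -> List[SentencePair]:
        return [self.original, *self.variants]


def case_is_robust(case: GECCase, correct: Callable[[str], str]) -> bool:
    """A case counts as robust only if the system output matches the
    reference for the original AND for every variant (exact match is an
    assumption made for this sketch)."""
    return all(correct(src) == ref for src, ref in case.all_pairs())


def robustness_rate(cases: List[GECCase], correct: Callable[[str], str]) -> float:
    """Fraction of cases on which the system is robust; 0.0 for no cases."""
    if not cases:
        return 0.0
    return sum(case_is_robust(c, correct) for c in cases) / len(cases)


def toy_system(sentence: str) -> str:
    """Deliberately brittle stand-in for a GEC system: it fixes the
    agreement error only when the subject is exactly 'He'."""
    return sentence.replace("He go ", "He goes ")


# Two variants are shown for brevity; a RobustGEC case has five.
example_case = GECCase(
    original=("He go to school every day.", "He goes to school every day."),
    variants=[
        ("He go to work every day.", "He goes to work every day."),
        ("She go to school every day.", "She goes to school every day."),
    ],
)

if __name__ == "__main__":
    print(case_is_robust(example_case, toy_system))
    print(robustness_rate([example_case], toy_system))
```

Here the toy system corrects the original sentence and the first variant but misses the error once the subject changes to "She", so the case is not counted as robust. Requiring success on every variant is a deliberately strict choice for this sketch; a softer score could instead average over the variants.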