LM-Combiner: A Contextual Rewriting Model for Chinese Grammatical Error Correction

Over-correction is a critical problem in the Chinese grammatical error correction (CGEC) task. Recent work using voting-based model ensemble methods can effectively mitigate over-correction and improve the precision of a GEC system. However, these methods still require the outputs of several GEC systems and inevitably reduce error recall. In light of this, we propose LM-Combiner, a rewriting model that directly revises over-corrections in the output of a GEC system without a model ensemble. Specifically, we train the model on an over-correction dataset constructed with the proposed K-fold cross inference method, which allows it to directly generate filtered sentences by combining the original and the over-corrected text. At inference time, we take the original sentences and the outputs of other systems as input and obtain the filtered sentences from LM-Combiner. Experiments on the FCGEC dataset show that the proposed method effectively alleviates the over-correction of the original system (+18.2 Precision) while keeping error recall unchanged. In addition, we find that LM-Combiner retains good rewriting performance even with few parameters and little training data, and can therefore cost-effectively mitigate the over-correction of black-box GEC systems (e.g., ChatGPT).
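
The abstract does not spell out how K-fold cross inference works, so the following is only a minimal sketch of one plausible reading: split the parallel training data into K folds, train a GEC model on K-1 of them, and run it on the held-out fold, so that every training sentence is corrected by a model that never saw it. The names `k_fold_cross_inference`, `train_fn`, and `build_combiner_input`, as well as the `[SEP]` separator used to join the original and candidate sentences, are assumptions made for illustration and are not taken from the paper.

```python
from typing import Callable, List, Sequence, Tuple

Pair = Tuple[str, str]         # (source sentence, gold correction)
Triple = Tuple[str, str, str]  # (source, system output, gold correction)
Corrector = Callable[[str], str]


def k_fold_cross_inference(
    pairs: Sequence[Pair],
    k: int,
    train_fn: Callable[[Sequence[Pair]], Corrector],
) -> List[Triple]:
    """Collect held-out GEC outputs for every training pair.

    train_fn is a hypothetical callback: it receives the training folds
    and returns a function that corrects a single sentence.
    """
    if not 2 <= k <= len(pairs):
        raise ValueError("k must be between 2 and len(pairs)")
    triples: List[Triple] = []
    for fold in range(k):
        # Round-robin split: example i belongs to fold i % k.
        held_out = [p for i, p in enumerate(pairs) if i % k == fold]
        train = [p for i, p in enumerate(pairs) if i % k != fold]
        corrector = train_fn(train)
        for src, gold in held_out:
            # The output may contain over-corrections, since the model
            # has not seen this example during training.
            triples.append((src, corrector(src), gold))
    return triples


def build_combiner_input(source: str, candidate: str, sep: str = " [SEP] ") -> str:
    """Join the original sentence and a GEC output into one input string.

    The separator is an assumption; the abstract only says both texts
    are given to the rewriting model.
    """
    return f"{source}{sep}{candidate}"
```

Under this reading, each resulting triple would give the rewriting model an input built from the source and the system output, with the gold correction as the target. At inference time the same `build_combiner_input` step would be applied to the output of any GEC system, including a black-box one.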

