Coherence in writing, an aspect that second-language (L2) English learners often struggle with, is crucial in assessing L2 English writing. Existing automated writing evaluation systems primarily use basic surface linguistic features to detect coherence in writing. However, little effort has been made to correct the detected incoherence, which could significantly benefit L2 language learners seeking to improve their writing. To bridge this gap, we introduce DECOR, a novel benchmark that includes expert annotations for detecting incoherence in L2 English writing, identifying the underlying reasons, and rewriting the incoherent sentences. To our knowledge, DECOR is the first coherence assessment dataset specifically designed for improving L2 English writing, featuring pairs of original incoherent sentences alongside their expert-rewritten counterparts. Additionally, we fine-tune models to automatically detect and rewrite incoherence in student essays. We find that incorporating specific reasons for incoherence during fine-tuning consistently improves the quality of the rewrites, which are favored in both automatic and human evaluations.