The study investigates the efficacy of pre-trained language models (PLMs) in analyzing argumentative moves in a longitudinal learner corpus. Prior studies of argumentative moves often rely on qualitative analysis and manual coding, limiting their efficiency and generalizability. The study aims to: 1) assess the reliability of PLMs in analyzing argumentative moves; and 2) use PLM-generated annotations to illustrate developmental patterns and predict writing quality. A longitudinal corpus of 1,643 argumentative texts from 235 English learners in China is collected and annotated into six move types: claim, data, counter-claim, counter-data, rebuttal, and non-argument. The corpus is divided into training, validation, and application sets, annotated by human experts and PLMs respectively. We use BERT as a representative implementation of PLMs. The results indicate robust reliability of PLMs in analyzing argumentative moves, with an overall F1 score of 0.743, surpassing existing models in the field. Moreover, PLM-labeled argumentative moves effectively capture developmental patterns and predict writing quality. Over time, students exhibit increased use of data and counter-claims and decreased use of non-argument moves. Whereas low-quality texts are characterized by a predominance of claims and data supporting only a one-sided position, mid- and high-quality texts demonstrate an integrative perspective with a higher ratio of counter-claims, counter-data, and rebuttals. The study underscores the transformative potential of integrating artificial intelligence into language education, enhancing the efficiency and accuracy of evaluating students' writing. The successful application of PLMs can catalyze the development of educational technology, promoting a more data-driven, personalized learning environment that supports diverse educational needs.
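The six move labels and the overall F1 score come directly from the study; the sketch below shows one plausible way such a score could be computed, assuming "overall F1" means an unweighted (macro) average of per-label F1 across the six move types. The helper names and the toy annotations are illustrative assumptions, not the authors' code.

```python
from collections import Counter

# The six move types annotated in the corpus (from the study).
MOVES = ["claim", "data", "counter-claim", "counter-data",
         "rebuttal", "non-argument"]

def f1_per_label(gold, pred, label):
    """F1 for one move type, from parallel gold/predicted label lists."""
    tp = sum(1 for g, p in zip(gold, pred) if g == p == label)
    fp = sum(1 for g, p in zip(gold, pred) if g != label and p == label)
    fn = sum(1 for g, p in zip(gold, pred) if g == label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    return 2 * precision * recall / denom if denom else 0.0

def overall_f1(gold, pred):
    """Unweighted mean of per-label F1 over all six move types."""
    return sum(f1_per_label(gold, pred, m) for m in MOVES) / len(MOVES)

# Toy example: a PLM's sentence-level predictions vs. expert annotations.
gold = ["claim", "data", "counter-claim", "rebuttal"]
pred = ["claim", "data", "claim", "rebuttal"]
score = overall_f1(gold, pred)
```

In practice the predictions would come from a fine-tuned BERT classifier over sentences or clauses; the evaluation logic itself is independent of the model.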