The study investigates the efficacy of pre-trained language models (PLMs) in analyzing argumentative moves in a longitudinal learner corpus. Prior studies of argumentative moves often rely on qualitative analysis and manual coding, limiting their efficiency and generalizability. The study aims to: 1) assess the reliability of PLMs in analyzing argumentative moves; and 2) use PLM-generated annotations to illustrate developmental patterns and predict writing quality. A longitudinal corpus of 1,643 argumentative texts from 235 English learners in China is collected and annotated into six move types: claim, data, counter-claim, counter-data, rebuttal, and non-argument. The corpus is divided into training, validation, and application sets, annotated by human experts and PLMs respectively. We use BERT as a representative implementation of PLMs. The results indicate robust reliability of PLMs in analyzing argumentative moves, with an overall F1 score of 0.743, surpassing existing models in the field. Moreover, PLM-labeled argumentative moves effectively capture developmental patterns and predict writing quality. Over time, students exhibit increased use of data and counter-claims and decreased use of non-argument moves. Whereas low-quality texts are characterized by a predominance of claims and data supporting only a one-sided position, mid- and high-quality texts demonstrate an integrative perspective with a higher ratio of counter-claims, counter-data, and rebuttals. The study underscores the transformative potential of integrating artificial intelligence into language education, enhancing the efficiency and accuracy of evaluating students' writing. The successful application of PLMs can catalyze the development of educational technology, promoting a more data-driven, personalized learning environment that supports diverse educational needs.
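The six move labels and the overall F1 score come directly from the study; the sketch below shows one plausible way such a score could be computed, assuming "overall F1" means an unweighted (macro) average of per-label F1 across the six move types. The helper names and the toy annotations are illustrative assumptions, not the authors' code.

```python
from collections import Counter

# The six move types annotated in the corpus (from the study).
MOVES = ["claim", "data", "counter-claim", "counter-data",
         "rebuttal", "non-argument"]

def f1_per_label(gold, pred, label):
    """F1 for one move type, from parallel gold/predicted label lists."""
    tp = sum(1 for g, p in zip(gold, pred) if g == p == label)
    fp = sum(1 for g, p in zip(gold, pred) if g != label and p == label)
    fn = sum(1 for g, p in zip(gold, pred) if g == label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    return 2 * precision * recall / denom if denom else 0.0

def overall_f1(gold, pred):
    """Unweighted mean of per-label F1 over all six move types."""
    return sum(f1_per_label(gold, pred, m) for m in MOVES) / len(MOVES)

# Toy example: a PLM's sentence-level predictions vs. expert annotations.
gold = ["claim", "data", "counter-claim", "rebuttal"]
pred = ["claim", "data", "claim", "rebuttal"]
score = overall_f1(gold, pred)
```

In practice the predictions would come from a fine-tuned BERT classifier over sentences or clauses; the evaluation logic itself is independent of the model.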