This study demonstrates the effectiveness of XLNet, a transformer-based language model, for annotating argumentative elements in persuasive essays. XLNet's architecture incorporates a recurrence mechanism that allows it to model long-term dependencies in lengthy texts. Fine-tuned XLNet models were applied to three datasets annotated with different schemes: a proprietary dataset using the Annotations for Revisions and Reflections on Writing (ARROW) scheme, the PERSUADE corpus, and the Argument Annotated Essays (AAE) dataset. The XLNet models achieved strong performance on all three datasets, in some cases surpassing human agreement levels. These results indicate that XLNet can handle both diverse annotation schemes and lengthy essays. Comparing the model outputs across datasets also shed light on the relationships between the annotation tags of the different schemes. Overall, XLNet's strong performance in modeling argumentative structure across diverse datasets highlights its suitability for providing automated feedback on essay organization.
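To make the comparison against human agreement concrete, the following is a minimal sketch of how model-human agreement could be set against human-human agreement on the same tokens. It assumes Cohen's kappa as the agreement statistic; the abstract does not name the metric used in the study. The function name `cohen_kappa`, the tag names, and the toy label sequences are all hypothetical and are not data or results from this work.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa between two equal-length label sequences.

    Assumption: kappa stands in for whatever agreement statistic
    the study actually reports.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and equal in length")

    n = len(labels_a)
    # Observed agreement: fraction of positions where both labels match.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: product of each tag's marginal frequencies, summed over tags.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[tag] * counts_b[tag] for tag in counts_a) / (n * n)

    if expected == 1.0:
        # Both sequences use a single identical tag; treat as perfect agreement.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical token-level tags for one short essay fragment (illustrative only).
annotator_1 = ["Claim", "Claim", "Evidence", "Evidence", "Other", "Claim"]
annotator_2 = ["Claim", "Evidence", "Evidence", "Evidence", "Other", "Claim"]
model_tags = ["Claim", "Claim", "Evidence", "Evidence", "Other", "Other"]

human_human = cohen_kappa(annotator_1, annotator_2)
model_human = cohen_kappa(model_tags, annotator_1)
print(f"human-human kappa: {human_human:.3f}")
print(f"model-human kappa: {model_human:.3f}")
```

Under this reading, a model would be said to surpass human agreement when its kappa against a reference annotator exceeds the kappa between two human annotators on the same text.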