Automated Essay Scoring (AES) plays a crucial role in education by providing scalable and efficient assessment tools. In real-world settings, however, the extreme scarcity of labeled data severely limits the development and practical adoption of robust AES systems. This study proposes a novel approach that enhances AES performance in both limited-data and full-data settings through three key techniques. First, we introduce a Two-Stage fine-tuning strategy that uses low-rank adaptation to better adapt an AES model to essays written for the target prompt. Second, we introduce a Score Alignment technique that improves the consistency between the predicted and true score distributions. Third, we employ uncertainty-aware self-training on unlabeled data, which expands the training set with pseudo-labeled samples while mitigating the propagation of label noise. We implement these three techniques on DualBERT and conduct extensive experiments on the ASAP++ dataset. In the 32-data setting, each of the three techniques improves performance, and their integration achieves 91.2% of the performance of the full-data model, which is trained on approximately 1,000 labeled samples. In addition, the proposed Score Alignment technique consistently improves performance in both limited-data and full-data settings; for example, it achieves state-of-the-art results in the full-data setting when integrated into DualBERT.
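As an illustration of the uncertainty-aware self-training idea described above, the following minimal sketch keeps only those unlabeled essays whose repeated predictions agree closely and uses their mean prediction as the pseudo-label. It is not the paper's implementation: the function name `select_pseudo_labels`, the threshold `max_std`, and the use of the standard deviation over several stochastic forward passes (for example, with dropout enabled) as the uncertainty measure are all assumptions made for this example.

```python
from statistics import mean, pstdev


def select_pseudo_labels(stochastic_preds, max_std):
    """Return (essay_index, pseudo_label) pairs for low-uncertainty essays.

    stochastic_preds: one list of normalized scores per unlabeled essay,
        e.g. from several forward passes with dropout enabled. This
        uncertainty estimate is an assumption of the sketch, not
        necessarily the method used in the paper.
    max_std: hypothetical threshold; essays whose predictions spread
        more than this are discarded to limit label noise.
    """
    selected = []
    for idx, preds in enumerate(stochastic_preds):
        # Standard deviation across passes serves as the uncertainty.
        if pstdev(preds) <= max_std:
            # The mean prediction becomes the pseudo-label.
            selected.append((idx, mean(preds)))
    return selected


# Toy predictions for three unlabeled essays (three passes each).
preds = [
    [0.50, 0.50, 0.50],  # passes agree: kept
    [0.10, 0.90, 0.50],  # passes disagree: dropped
    [0.75, 0.75, 0.75],  # passes agree: kept
]
pseudo = select_pseudo_labels(preds, max_std=0.1)
```

In a self-training loop, the selected pairs would be added to the labeled set before the next round of fine-tuning, while the discarded essays stay in the unlabeled pool.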