Automated Scoring for Reading Comprehension via In-context BERT Tuning

Source: arXiv. Published as a conference paper at AIED 2022. Grand prize winner of the NAEP Automated Scoring (AS) Challenge. Code available at: https://github.com/ni9elf/automated-scoring

Automated scoring of open-ended student responses has the potential to significantly reduce human grader effort. Recent advances in automated scoring often use textual representations from pre-trained language models such as BERT and GPT as input to scoring models. Most existing approaches train a separate model for each item/question, which suits scenarios such as essay scoring, where items can differ substantially from one another. However, these approaches have two limitations: 1) they fail to leverage item linkage in scenarios such as reading comprehension, where multiple items may share a reading passage; 2) they do not scale, since storing one model per item becomes difficult when models have a large number of parameters. In this paper, we report our (grand prize-winning) solution to the National Assessment of Educational Progress (NAEP) automated scoring challenge for reading comprehension. Our approach, in-context BERT fine-tuning, produces a single shared scoring model for all items, with a carefully designed input structure that provides contextual information on each item. We demonstrate the effectiveness of our approach via local evaluations using the training dataset provided by the challenge. We also discuss the biases, common error types, and limitations of our approach.
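To make the idea of a single shared model with per-item context more concrete, here is a minimal sketch of how one input string might be assembled. The abstract does not specify the input structure, so everything below is an assumption for illustration: the field order, the `response:` / `question:` / `example:` labels, the use of `[SEP]` as a plain-text separator, and the names `ScoredExample` and `build_input` are all hypothetical and are not taken from the authors' code.

```python
from dataclasses import dataclass
from typing import List

SEP = " [SEP] "  # BERT-style separator, used here only as plain text


@dataclass
class ScoredExample:
    """A previously scored response to the same item (hypothetical structure)."""
    response: str
    score: int


def build_input(question: str, response: str, examples: List[ScoredExample]) -> str:
    """Join the target response, the item's question, and scored examples.

    The layout is an assumed one: the abstract only says the input is
    designed to carry contextual information about each item.
    """
    parts = [f"response: {response}", f"question: {question}"]
    for ex in examples:
        parts.append(f"example: {ex.response} score: {ex.score}")
    return SEP.join(parts)


# Made-up item and responses, purely for illustration.
item_examples = [
    ScoredExample("The fox was hungry.", 2),
    ScoredExample("I don't know.", 1),
]
text = build_input("Why did the fox leave the den?", "It needed food.", item_examples)
print(text)
```

Because the item-specific context travels inside the input rather than in the model weights, the same function (and, under this assumption, the same fine-tuned model) could serve every item that shares a passage.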

