Multiple-choice reading and listening comprehension tests are an important part of language assessment. Content creators for standard educational tests need to carefully curate questions that assess the comprehension abilities of candidates. However, recent work has shown that a large number of questions in general multiple-choice reading comprehension datasets can be answered without comprehending the passage, by leveraging world knowledge instead. This work investigates how much of the context passage needs to be read to work out the correct answer, in multiple-choice reading comprehension based on conversation transcriptions and in listening comprehension tests. We find that automated reading comprehension systems can perform significantly better than random with partial or even no access to the context passage. These findings offer content creators a way to automatically capture the trade-off between the comprehension and the world knowledge that their proposed questions require.
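The idea of limiting access to the context and comparing against random guessing can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used in this work: the helper names `truncate_context` and `p_value_above_chance` are hypothetical, keeping the leading words of the passage is only one possible way to give partial access, and the exact one-sided binomial test is a swapped-in choice of significance test.

```python
from math import comb


def truncate_context(passage: str, fraction: float) -> str:
    """Keep the leading `fraction` of the passage's whitespace-separated words.

    Assumption: partial access means reading the passage from the start.
    fraction=0.0 gives the no-context setting, fraction=1.0 the full passage.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be in [0, 1]")
    words = passage.split()
    keep = int(len(words) * fraction)  # round down to a whole number of words
    return " ".join(words[:keep])


def p_value_above_chance(num_correct: int, num_questions: int, num_options: int) -> float:
    """Exact one-sided binomial p-value, P(X >= num_correct), under random guessing.

    Assumption: every question has `num_options` options and a random guesser
    picks each with equal probability.
    """
    p = 1.0 / num_options
    return sum(
        comb(num_questions, k) * p**k * (1 - p) ** (num_questions - k)
        for k in range(num_correct, num_questions + 1)
    )


if __name__ == "__main__":
    # Toy passage and made-up counts, for illustration only.
    passage = "A: Where are you going? B: To the library to return a book."
    print(truncate_context(passage, 0.5))
    print(p_value_above_chance(num_correct=40, num_questions=100, num_options=4))
```

A small p-value at `fraction=0.0` would suggest that a question set can be answered largely from world knowledge. Sweeping `fraction` from 0 to 1 would trace how accuracy depends on the amount of context read.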