Question generation (QG) is a natural language processing task with many potential benefits and use cases in the educational domain. For this potential to be realized, QG systems must be designed and validated with pedagogical needs in mind. However, little research has assessed or designed QG approaches with input from real teachers or students. This paper applies a large language model-based QG approach in which questions are generated with learning goals derived from Bloom's taxonomy. The automatically generated questions are used in multiple experiments designed to assess how teachers use them in practice. The results demonstrate that teachers prefer to write quizzes with automatically generated questions, and that such quizzes show no loss in quality compared to handwritten versions. Further, several metrics indicate that automatically generated questions can even improve the quality of the quizzes created, showing the promise of large-scale use of QG in the classroom setting.