In the era of generative artificial intelligence (AI), integrating large language models (LLMs) into modern education offers unprecedented opportunities for innovation. We explore prompted LLMs in educational and assessment applications to uncover their potential. Through a series of research questions, we investigate the effectiveness of prompt-based techniques for generating open-ended questions from school-level textbooks, assess their efficiency in generating open-ended questions from undergraduate-level technical textbooks, and explore the feasibility of a chain-of-thought-inspired multi-stage prompting approach for language-agnostic multiple-choice question (MCQ) generation. Additionally, we evaluate prompted LLMs for language learning through a case study in Bengali, a low-resource Indian language, in which the models explain Bengali grammatical errors. We also evaluate the potential of prompted LLMs to assess human resource (HR) spoken interview transcripts. By comparing the capabilities of LLMs with those of human experts across these educational tasks and domains, we aim to shed light on the potential and limitations of LLMs in reshaping educational practices.