In educational applications, LLMs exhibit several fundamental pedagogical limitations, such as a tendency to reveal solutions rather than support dialogic learning. We introduce ConvoLearn (https://huggingface.co/datasets/masharma/convolearn), a dataset grounded in knowledge-building theory that operationalizes six core pedagogical dimensions: cognitive engagement, formative assessment, accountability, cultural responsiveness, metacognition, and power dynamics. We construct a semi-synthetic dataset of 1,250 tutor-student dialogues (20 turns each) in middle school Earth Science through controlled interactions between human teachers and a simulated student. Using QLoRA fine-tuning, we demonstrate that training on this dataset meaningfully shifts LLM behavior toward knowledge-building strategies. Human evaluation by 31 teachers shows that our fine-tuned Mistral 7B (M = 4.10, SD = 1.03) significantly outperforms both its base version (M = 2.59, SD = 1.11) and Claude Sonnet 4.5 (M = 2.87, SD = 1.29) overall. This work establishes a potential framework to guide the future development and evaluation of constructivist AI tutors.