Most AI-based educational tools today adopt a one-on-one tutoring paradigm, pairing a single LLM with a single learner. Yet decades of learning science research suggest that multi-party interaction -- through peer modeling, co-construction, and exposure to diverse perspectives -- can produce learning benefits that dyadic tutoring alone cannot. In this paper, we investigate whether multi-agent LLM configurations can enhance learning outcomes beyond what a single LLM tutor provides. We present two controlled experiments spanning distinct learning contexts. In a convergent problem-solving study ($N=315$), participants tackle SAT-level math problems in a 2$\times$2 design that varies the presence of an LLM tutor and of LLM peers, where the peers each make a different kind of error (conceptual vs.\ arithmetic); participants who interacted with both a tutor and peers achieved the highest unassisted test accuracy. In a divergent composition study ($N=247$), participants write argumentative and creative essays with either no AI assistance, a single LLM (Claude or ChatGPT), or both Claude and ChatGPT together; while both LLM conditions improved essay quality, only the two-agent condition avoided the idea-level homogeneity that single-model assistance produced. Together, these studies offer one of the first controlled investigations of multi-agent LLM learning environments, probing whether the move from one-on-one AI tutoring toward richer agent configurations can unlock the collaborative and observational benefits long documented in human social learning research.