Large language models (LLMs) are flexible, personalizable, and widely available, which makes their use within Intelligent Tutoring Systems (ITSs) appealing. However, that flexibility creates risks: inaccuracies, harmful content, and non-curricular material. Ethically deploying LLM-backed ITSs requires designing safeguards that ensure positive experiences for students. We describe the design of a conversational system integrated into an ITS and our experience evaluating its safety through red-teaming, an in-classroom usability test, and field deployment. We present empirical data from more than 8,000 student conversations with this system, finding that GPT-3.5 rarely generates inappropriate messages. Comparatively more common are inappropriate messages from students, which prompts us to frame safeguarding as a content moderation and classroom management problem. The student interaction behaviors we observe carry implications for designers, who should treat student inputs as a content moderation problem, and for researchers, who should focus on subtle forms of bad content.