AI-powered language learning tools increasingly provide instant, personalised feedback to millions of learners worldwide. However, this feedback can fail in ways that are difficult for learners, and even teachers, to detect, potentially reinforcing misconceptions and eroding learning outcomes over extended use. We present a portion of L2-Bench, a benchmark for evaluating AI systems in language education that includes (but is not limited to) six critical dimensions of effective feedback: diagnostic accuracy, awareness of appropriacy, causes of error, prioritisation, guidance for improvement, and supporting self-regulation. We analyse how AI systems can fail with respect to these dimensions. We argue that these failures are conducive to "explainability pitfalls": AI-generated explanations that appear helpful on the surface but are fundamentally flawed, increasing the risk of attainment, human-AI interaction, and socioaffective harms. We discuss how the specific context of language learning amplifies these risks and outline open questions that we believe merit more attention when designing evaluation frameworks. Our analysis aims to expand the community's understanding of both the typology of explainability pitfalls and the contextual dynamics in which they may occur, in order to encourage AI developers to design safer, more trustworthy, and more effective AI explanations.