Interpretability Framework for LLMs in Undergraduate Calculus

Large Language Models (LLMs) are increasingly used in education, yet answer correctness alone does not capture the quality, reliability, or pedagogical validity of their problem-solving behavior. This matters most in mathematics, where multistep logic, symbolic reasoning, and conceptual clarity are critical. Conventional evaluation methods focus largely on final-answer accuracy and overlook the reasoning process. To address this gap, we introduce a novel interpretability framework for analyzing LLM-generated solutions, using undergraduate calculus problems as a representative domain. Our approach combines reasoning flow extraction and the decomposition of solutions into semantically labeled operations and concepts with prompt ablation analysis to assess input salience and output stability. Using structured metrics such as reasoning complexity, phrase sensitivity, and robustness, we evaluate model behavior on real university exams from Calculus I to III. Our findings show that LLMs often produce syntactically fluent yet conceptually flawed solutions, with reasoning patterns that are sensitive to prompt phrasing and input variation. The framework enables fine-grained diagnosis of reasoning failures, supports curriculum alignment, and informs the design of interpretable AI-assisted feedback tools. This is the first study to offer a structured, quantitative, and pedagogically grounded framework for interpreting LLM reasoning in mathematics education, laying the foundation for the transparent and responsible deployment of AI in STEM learning environments.
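
The abstract does not specify how prompt ablation or phrase sensitivity is computed, so the following is only a minimal illustrative sketch of one plausible reading: remove one phrase of the prompt at a time and measure how much the model's output changes. The function names, the string-similarity measure (`difflib.SequenceMatcher`), and the `toy_solver` stand-in for an LLM call are all assumptions made for illustration, not the authors' method.

```python
from difflib import SequenceMatcher
from typing import Callable, List, Tuple


def ablate(phrases: List[str]) -> List[Tuple[str, str]]:
    """Return (removed_phrase, ablated_prompt) pairs, one per phrase."""
    variants = []
    for i, phrase in enumerate(phrases):
        kept = phrases[:i] + phrases[i + 1:]
        variants.append((phrase, " ".join(kept)))
    return variants


def phrase_sensitivity(
    phrases: List[str], solve: Callable[[str], str]
) -> List[Tuple[str, float]]:
    """Score each phrase by how much its removal changes the output.

    Assumed definition: 1 - string similarity between the baseline output
    and the output for the ablated prompt (0.0 means no change).
    """
    baseline = solve(" ".join(phrases))
    scores = []
    for phrase, prompt in ablate(phrases):
        similarity = SequenceMatcher(None, baseline, solve(prompt)).ratio()
        scores.append((phrase, 1.0 - similarity))
    return scores


def toy_solver(prompt: str) -> str:
    """Hypothetical stand-in for an LLM; reacts only to the word 'derivative'."""
    if "derivative" in prompt:
        return "apply power rule: 2x"
    return "unclear task"


phrases = ["Find the derivative", "of x^2", "with respect to x."]
scores = phrase_sensitivity(phrases, toy_solver)
for phrase, score in scores:
    print(f"{phrase!r}: sensitivity {score:.2f}")
```

In this toy setup only the phrase containing "derivative" receives a nonzero score, because it is the only one whose removal changes the solver's output. A real study would replace `toy_solver` with an actual model call and would likely compare extracted reasoning steps rather than raw strings.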

