Digital educational environments increasingly feature complex discourse between AI and human participants, providing researchers with an abundance of data that offers deep insight into learning and instructional processes. However, traditional qualitative analysis remains a labor-intensive bottleneck that severely limits the scale at which this research can be conducted. We present Sandpiper, a mixed-initiative system designed to bridge high-volume conversational data and human qualitative expertise. By tightly coupling interactive researcher dashboards with agentic Large Language Model (LLM) engines, the platform enables scalable analysis without sacrificing methodological rigor. Sandpiper addresses critical barriers to AI adoption in education by implementing context-aware, automated de-identification workflows on secure, university-housed infrastructure to ensure data privacy. Furthermore, the system employs schema-constrained orchestration to suppress LLM hallucinations and enforce strict adherence to qualitative codebooks. An integrated evaluation engine supports continuous benchmarking of AI performance against human labels, fostering an iterative approach to model refinement and validation. We propose a user study to evaluate the system's efficacy in improving research efficiency, inter-rater reliability, and researcher trust in AI-assisted qualitative workflows.
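The schema-constrained orchestration described above can be illustrated with a minimal sketch: a model-proposed code is accepted only if it appears in the researcher-defined codebook, so out-of-vocabulary ("hallucinated") codes are rejected rather than stored. The codebook entries and the `validate_label` helper below are hypothetical illustrations, not Sandpiper's actual API.

```python
# Hypothetical codebook mapping code names to their definitions.
CODEBOOK = {
    "help_seeking": "Learner asks the tutor for assistance",
    "self_explanation": "Learner articulates their own reasoning",
    "off_task": "Utterance unrelated to the learning activity",
}

def validate_label(raw_label: str) -> str:
    """Normalize a model-proposed label and reject codes outside the codebook."""
    label = raw_label.strip().lower()
    if label not in CODEBOOK:
        # A schema-constrained pipeline would surface this for retry or review
        # instead of silently admitting an invented code.
        raise ValueError(f"Code {label!r} is not in the codebook")
    return label
```

In practice this constraint can also be pushed into the generation step itself (e.g., an enumerated field in a structured-output schema), with validation retained as a defensive check.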
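Benchmarking AI performance against human labels, as the evaluation engine does, typically uses a chance-corrected agreement statistic. A minimal sketch of Cohen's kappa for two label sequences follows; the function is illustrative and is not drawn from Sandpiper's implementation.

```python
from collections import Counter

def cohens_kappa(human: list, ai: list) -> float:
    """Cohen's kappa: agreement between two raters, corrected for chance.

    Assumes at least one disagreement is possible (expected agreement < 1).
    """
    assert len(human) == len(ai) and human, "need equal-length, non-empty label lists"
    n = len(human)
    # Observed agreement: fraction of items where the raters assign the same code.
    observed = sum(h == a for h, a in zip(human, ai)) / n
    # Expected chance agreement from each rater's marginal code frequencies.
    h_counts, a_counts = Counter(human), Counter(ai)
    expected = sum(h_counts[c] * a_counts[c] for c in h_counts) / (n * n)
    return (observed - expected) / (1 - expected)
```

Tracking such a statistic per code over successive prompt or model revisions is one way to realize the iterative refinement loop described in the abstract.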