Modernizing Ground Truth: Four Shifts Toward Improving Reliability and Validity in AI in Education

Danielle R. Thomas, Conrad Borchers, Kirk P. Vanacore, Kenneth R. Koedinger, René F. Kizilcec

From arXiv; accepted as a full paper at the 27th International Conference on Artificial Intelligence in Education (AIED 2026).

Generative Artificial Intelligence (GenAI) is now widespread in education, yet the efficacy of GenAI systems remains constrained by the quality and interpretation of the labeled data used to train and evaluate them. Studies commonly report inter-rater reliability (IRR), often summarized by a single coefficient such as Cohen's kappa (κ), as a gatekeeper to "ground truth." We argue that many educational assessment and practice support settings pose challenges, such as high-inference constructs, skewed label distributions, and temporally segmented multimodal data, that can lead to misapplication or misinterpretation of threshold-based heuristics for IRR. The growing use of large language models (LLMs) as annotators and judges introduces further risks, such as automation bias and circular validation. We propose four practical shifts for establishing ground truth: (1) treat IRR as a diagnostic signal for localizing disagreement and refining constructs rather than as a mechanical acceptance threshold (e.g., κ > 0.8); (2) require transparent reporting of rater expertise, codebook development, reconciliation procedures, and segmentation rules; (3) mitigate risks in LLM annotation through bias audits and verification workflows; and (4) complement agreement statistics with validity and effectiveness evidence for the intended use, including uncertainty-aware labeling (e.g., assigning different labels to the same item to capture nuance), criterion-related checks (e.g., predictive tests of whether labels forecast the intended outcome), and close-the-loop evaluations of whether systems trained on these labels improve learning beyond a reasonable control. We illustrate these shifts through case studies of multimodal tutoring data and provide actionable recommendations for strengthening the evidence base of labeled AIED datasets.
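To make the point about skewed label distributions and threshold heuristics concrete, the following is a minimal sketch, not the authors' method. It computes Cohen's kappa from its standard definition, (observed agreement − chance agreement) / (1 − chance agreement), for two hypothetical raters, and then tabulates where they disagree, in the spirit of shift (1). The label names ("praise", "other"), the item counts, and the helper names are all assumptions made up for illustration; none of them come from the paper.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who labeled the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty item set")
    n = len(rater_a)
    # Observed agreement: share of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal label proportions.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    labels = set(freq_a) | set(freq_b)
    expected = sum(freq_a[c] * freq_b[c] for c in labels) / (n * n)
    if expected == 1.0:
        # Both raters used one identical label throughout; kappa is undefined.
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


def disagreement_table(rater_a, rater_b):
    """Count (label_a, label_b) pairs on which the raters disagree."""
    return Counter((a, b) for a, b in zip(rater_a, rater_b) if a != b)


def make_pairs(both_pos, a_only, b_only, both_neg, pos="praise", neg="other"):
    """Build two hypothetical label sequences from 2x2 agreement counts."""
    a = [pos] * (both_pos + a_only) + [neg] * (b_only + both_neg)
    b = [pos] * both_pos + [neg] * a_only + [pos] * b_only + [neg] * both_neg
    return a, b


# Hypothetical data: both datasets have 94% raw agreement on 100 items.
skewed_a, skewed_b = make_pairs(both_pos=2, a_only=3, b_only=3, both_neg=92)
balanced_a, balanced_b = make_pairs(both_pos=47, a_only=3, b_only=3, both_neg=47)

kappa_skewed = cohen_kappa(skewed_a, skewed_b)
kappa_balanced = cohen_kappa(balanced_a, balanced_b)

print(f"skewed labels:   kappa = {kappa_skewed:.2f}")    # 0.37
print(f"balanced labels: kappa = {kappa_balanced:.2f}")  # 0.88
print(disagreement_table(skewed_a, skewed_b))
```

With identical raw agreement, the rare-label dataset falls well below a 0.8 cutoff while the balanced one clears it, so a fixed threshold would treat the two differently even though the raters disagreed on the same number of items. The disagreement table shows which label pairs account for the mismatches, which is the kind of information a codebook revision can act on.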
