EssayCBM：面向透明作文评分的评分标准对齐概念瓶颈模型 (EssayCBM: Rubric-Aligned Concept Bottleneck Models for Transparent Essay Grading) - 专知论文

会员服务 ·

0

对齐 · 黑盒 · 系统 · 问责 · 协同 ·

2025 年 12 月 23 日

EssayCBM: Rubric-Aligned Concept Bottleneck Models for Transparent Essay Grading

翻译：EssayCBM：面向透明作文评分的评分标准对齐概念瓶颈模型

Kumar Satvik Chaudhary,Chengshuai Zhao,Fan Zhang,Yung Hin Tse,Garima Agrawal,Yuli Deng,Huan Liu

Understanding how automated grading systems evaluate essays remains a significant challenge for educators and students, especially when large language models function as black boxes. We introduce EssayCBM, a rubric-aligned framework that prioritizes interpretability in essay assessment. Instead of predicting grades directly from text, EssayCBM evaluates eight writing concepts, such as Thesis Clarity and Evidence Use, through dedicated prediction heads on an encoder. These concept scores form a transparent bottleneck, and a lightweight network computes the final grade using only concepts. Instructors can adjust concept predictions and instantly view the updated grade, enabling accountable human-in-the-loop evaluation. EssayCBM matches black-box performance while offering actionable, concept-level feedback through an intuitive web interface.

翻译：理解自动评分系统如何评估作文，对教育工作者和学生而言仍是一个重大挑战，尤其是在大型语言模型作为黑盒运行的情况下。我们提出了EssayCBM，这是一个优先考虑作文评估可解释性的、与评分标准对齐的框架。EssayCBM并非直接从文本预测分数，而是通过编码器上的专用预测头来评估八个写作概念（如论点清晰度和论据使用）。这些概念分数构成了一个透明的瓶颈，一个轻量级网络仅使用这些概念来计算最终分数。教师可以调整概念预测并即时查看更新后的分数，从而实现可问责的人机协同评估。EssayCBM在匹配黑盒模型性能的同时，通过直观的网页界面提供可操作的概念级反馈。

0

相关内容

DeepSeek模型综述：V1 V2 V3 R1-Zero

DeepSeek模型综述：V1 V2 V3 R1-Zero

专知会员服务

116+阅读 · 2025年2月11日

KG-BERT：基于BERT的知识图谱补全，KG-BERT: BERT for Knowledge Graph Completion

KG-BERT：基于BERT的知识图谱补全，KG-BERT: BERT for Knowledge Graph Completion

专知会员服务

195+阅读 · 2020年5月31日

强化学习的对比无监督表示，CURL: Contrastive Unsupervised Representations for Reinforcement Learning

强化学习的对比无监督表示，CURL: Contrastive Unsupervised Representations for Reinforcement Learning

专知会员服务

42+阅读 · 2020年4月11日

【Mila-Google】使用元学习动态调整源代码模型，On-the-Fly Adaptation of Source Code Models using Meta-Learning

【Mila-Google】使用元学习动态调整源代码模型，On-the-Fly Adaptation of Source Code Models using Meta-Learning

专知会员服务

21+阅读 · 2020年3月28日

【Facebook AI】对抗性NLI:自然语言理解的新基准，Adversarial NLI: A New Benchmark for Natural Language Understanding

【Facebook AI】对抗性NLI:自然语言理解的新基准，Adversarial NLI: A New Benchmark for Natural Language Understanding

专知会员服务

11+阅读 · 2019年11月2日

【ICML2020】多视角对比图表示学习，Contrastive Multi-View GRL

【ICML2020】多视角对比图表示学习，Contrastive Multi-View GRL

专知

37+阅读 · 2020年6月11日

IBM-小样本学习（Few-shot Learning）State of the art 方法及论文讲解

IBM-小样本学习（Few-shot Learning）State of the art 方法及论文讲解

专知

105+阅读 · 2019年4月15日

论文笔记之Feature Selective Networks for Object Detection

论文笔记之Feature Selective Networks for Object Detection

统计学习与视觉计算组

21+阅读 · 2018年7月26日

CosFace: Large Margin Cosine Loss for Deep Face Recognition论文笔记

CosFace: Large Margin Cosine Loss for Deep Face Recognition论文笔记

统计学习与视觉计算组

44+阅读 · 2018年4月25日

读论文Discriminative Deep Metric Learning for Face and KV

读论文Discriminative Deep Metric Learning for Face and KV

统计学习与视觉计算组

12+阅读 · 2018年4月6日

组合测试用例优先排序算法及选择策略研究

国家自然科学基金

9+阅读 · 2015年12月31日

分布式有监督学习的学习理论

国家自然科学基金

17+阅读 · 2015年12月31日

基于自主学习的Ad hoc Agent序贯决策研究

国家自然科学基金

46+阅读 · 2015年12月31日

高维数据下的模型平均方法

国家自然科学基金

6+阅读 · 2014年12月31日

基于组合Hodge理论的图像视频质量评价方法

国家自然科学基金

0+阅读 · 2014年12月31日

Mind2Report: A Cognitive Deep Research Agent for Expert-Level Commercial Report Synthesis

Arxiv

0+阅读 · 1月8日

AgentOCR: Reimagining Agent History via Optical Self-Compression

Arxiv

0+阅读 · 1月8日

FalconFS: Distributed File System for Large-Scale Deep Learning Pipeline

Arxiv

0+阅读 · 1月8日

CrackSegFlow: Controllable Flow Matching Synthesis for Generalizable Crack Segmentation with a 50K Image-Mask Benchmark

Arxiv

0+阅读 · 1月8日

RoboMIND 2.0: A Multimodal, Bimanual Mobile Manipulation Dataset for Generalizable Embodied Intelligence

Arxiv

0+阅读 · 1月6日

VIP会员

文章信息

相关主题

相关VIP内容

DeepSeek模型综述：V1 V2 V3 R1-Zero

DeepSeek模型综述：V1 V2 V3 R1-Zero

专知会员服务

116+阅读 · 2025年2月11日

KG-BERT：基于BERT的知识图谱补全，KG-BERT: BERT for Knowledge Graph Completion

KG-BERT：基于BERT的知识图谱补全，KG-BERT: BERT for Knowledge Graph Completion

专知会员服务

195+阅读 · 2020年5月31日

强化学习的对比无监督表示，CURL: Contrastive Unsupervised Representations for Reinforcement Learning

强化学习的对比无监督表示，CURL: Contrastive Unsupervised Representations for Reinforcement Learning

专知会员服务

42+阅读 · 2020年4月11日

【Mila-Google】使用元学习动态调整源代码模型，On-the-Fly Adaptation of Source Code Models using Meta-Learning

【Mila-Google】使用元学习动态调整源代码模型，On-the-Fly Adaptation of Source Code Models using Meta-Learning

专知会员服务

21+阅读 · 2020年3月28日

【Facebook AI】对抗性NLI:自然语言理解的新基准，Adversarial NLI: A New Benchmark for Natural Language Understanding

【Facebook AI】对抗性NLI:自然语言理解的新基准，Adversarial NLI: A New Benchmark for Natural Language Understanding

专知会员服务

11+阅读 · 2019年11月2日

热门VIP内容

开通专知VIP会员享更多权益服务

《面向小规模遥感应用引入思维链推理与多模态小语言模型》

《大国竞争时代的美国太空竞争力》50页报告

网络中心战：未来冲突

《自主无人机不会取代战斗机飞行员，将成为其僚机：协同作战飞机是下一代无人作战飞机》报告

相关资讯

【ICML2020】多视角对比图表示学习，Contrastive Multi-View GRL

【ICML2020】多视角对比图表示学习，Contrastive Multi-View GRL

专知

37+阅读 · 2020年6月11日

IBM-小样本学习（Few-shot Learning）State of the art 方法及论文讲解

IBM-小样本学习（Few-shot Learning）State of the art 方法及论文讲解

专知

105+阅读 · 2019年4月15日

论文笔记之Feature Selective Networks for Object Detection

论文笔记之Feature Selective Networks for Object Detection

统计学习与视觉计算组

21+阅读 · 2018年7月26日

CosFace: Large Margin Cosine Loss for Deep Face Recognition论文笔记

CosFace: Large Margin Cosine Loss for Deep Face Recognition论文笔记

统计学习与视觉计算组

44+阅读 · 2018年4月25日

读论文Discriminative Deep Metric Learning for Face and KV

读论文Discriminative Deep Metric Learning for Face and KV

统计学习与视觉计算组

12+阅读 · 2018年4月6日

相关论文

Mind2Report: A Cognitive Deep Research Agent for Expert-Level Commercial Report Synthesis

Arxiv

0+阅读 · 1月8日

AgentOCR: Reimagining Agent History via Optical Self-Compression

Arxiv

0+阅读 · 1月8日

FalconFS: Distributed File System for Large-Scale Deep Learning Pipeline

Arxiv

0+阅读 · 1月8日

CrackSegFlow: Controllable Flow Matching Synthesis for Generalizable Crack Segmentation with a 50K Image-Mask Benchmark

Arxiv

0+阅读 · 1月8日

RoboMIND 2.0: A Multimodal, Bimanual Mobile Manipulation Dataset for Generalizable Embodied Intelligence

Arxiv

0+阅读 · 1月6日

相关基金

组合测试用例优先排序算法及选择策略研究

国家自然科学基金

9+阅读 · 2015年12月31日

分布式有监督学习的学习理论

国家自然科学基金

17+阅读 · 2015年12月31日

基于自主学习的Ad hoc Agent序贯决策研究

国家自然科学基金

46+阅读 · 2015年12月31日

高维数据下的模型平均方法

国家自然科学基金

6+阅读 · 2014年12月31日

基于组合Hodge理论的图像视频质量评价方法

国家自然科学基金

0+阅读 · 2014年12月31日

微信扫码咨询专知VIP会员