Evaluating student performance in higher education is essential for gauging the effectiveness of teaching methods and for promoting equal opportunity. In this study, we investigate the correlation between two teachers' grading practices in a master's-level deep learning course offered at CentraleSupélec. The two teachers, who have distinct teaching styles, were responsible for marking the final project oral presentation. Our results indicate a significant positive correlation (0.76) between the grades assigned by the two teachers, suggesting that their assessments of students' performance are broadly consistent. Although consistent with each other, the grades do not seem to be fully reproducible from one examiner to the other, which points to serious drawbacks of relying on a single examiner for oral projects. Furthermore, we observed that the maximum difference between the grades assigned by the two examiners was 12.5%, with a mean of 6.3% (and a median of 5.0%), highlighting the potential impact of inter-examiner variability on students' final grades.
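As an illustration of the statistics reported above, the following minimal sketch computes a correlation coefficient and the maximum, mean, and median absolute grade differences between two examiners. It rests on several assumptions that the abstract does not confirm: the correlation is taken to be Pearson's, the grades are taken to be on a 20-point scale, and differences are expressed as a percentage of that scale. The grade lists and the function names `pearson` and `disagreement` are hypothetical and are not the study's data or code.

```python
from math import sqrt
from statistics import mean, median


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length grade lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def disagreement(x, y, scale=20.0):
    """Absolute per-student gaps, as a percentage of the grading scale.

    The 20-point scale is an assumption, not stated in the abstract.
    """
    gaps = [abs(a - b) / scale * 100.0 for a, b in zip(x, y)]
    return {"max": max(gaps), "mean": mean(gaps), "median": median(gaps)}


# Hypothetical grades (out of 20) given by two examiners to the same students.
examiner_a = [14.0, 16.0, 12.0, 18.0, 15.0]
examiner_b = [15.0, 15.0, 13.0, 17.0, 16.0]

r = pearson(examiner_a, examiner_b)
stats = disagreement(examiner_a, examiner_b)
print(f"correlation: {r:.2f}")
print(f"max gap: {stats['max']:.1f}%, mean gap: {stats['mean']:.1f}%, "
      f"median gap: {stats['median']:.1f}%")
```

Reporting the gaps alongside the correlation matters because a high correlation only shows that the two examiners rank students similarly; it does not rule out individual grades differing by several points.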