Recent work in multi-annotator learning has shifted focus from Consensus-oriented Learning (CoL), which aggregates multiple annotations into a single ground-truth prediction, to Individual Tendency Learning (ITL), which models annotator-specific labeling behavior patterns (i.e., tendencies) in order to explain annotator decisions. However, no evaluation framework currently exists to assess whether ITL methods truly capture individual tendencies and provide meaningful behavioral explanations. To address this gap, we propose the first unified evaluation framework with two novel metrics: (1) Difference of Inter-annotator Consistency (DIC), which quantifies how well models capture annotator tendencies by comparing the predicted inter-annotator similarity structure with the ground-truth one; and (2) Behavior Alignment Explainability (BAE), which evaluates how well model explanations reflect annotator behavior and decision relevance by aligning explainability-derived similarity structures with ground-truth labeling similarity structures via Multidimensional Scaling (MDS). Extensive experiments validate the effectiveness of the proposed evaluation framework.
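To make the idea behind DIC concrete, the following is a minimal sketch, not the paper's definition. The abstract only states that DIC compares predicted and ground-truth inter-annotator similarity structures; the choice of pairwise agreement rate as the similarity measure, the mean absolute difference as the comparison, and the names `agreement_matrix` and `dic_sketch` are all assumptions made for illustration.

```python
import numpy as np


def agreement_matrix(labels: np.ndarray) -> np.ndarray:
    """Pairwise agreement rates between annotators.

    labels: integer array of shape (n_annotators, n_samples).
    Returns an (n_annotators, n_annotators) matrix whose (i, j) entry is the
    fraction of samples on which annotators i and j give the same label.
    ASSUMPTION: agreement rate stands in for the similarity measure, which
    the abstract does not specify.
    """
    labels = np.asarray(labels)
    # Broadcast to (n_annotators, n_annotators, n_samples), then average over samples.
    return (labels[:, None, :] == labels[None, :, :]).mean(axis=2)


def dic_sketch(pred_labels: np.ndarray, true_labels: np.ndarray) -> float:
    """Hypothetical DIC-style score: lower means the predicted similarity
    structure is closer to the ground-truth one.

    ASSUMPTION: mean absolute difference over off-diagonal entries; the
    actual metric may use a different distance or normalization.
    """
    sim_pred = agreement_matrix(pred_labels)
    sim_true = agreement_matrix(true_labels)
    # The diagonal is always 1 in both matrices, so it carries no information.
    off_diag = ~np.eye(sim_true.shape[0], dtype=bool)
    return float(np.abs(sim_pred - sim_true)[off_diag].mean())


if __name__ == "__main__":
    # Three annotators, four samples (toy data, purely illustrative).
    true_labels = np.array([[0, 1, 1, 0],
                            [0, 1, 0, 0],
                            [1, 1, 1, 0]])
    # A consensus-style model that predicts the same labels for every annotator.
    consensus_pred = np.tile(true_labels[0], (3, 1))
    print(dic_sketch(true_labels, true_labels))
    print(dic_sketch(consensus_pred, true_labels))
```

Under these assumptions, a model that reproduces each annotator's labels exactly scores zero, while a model that collapses all annotators into one consensus prediction is penalized, because its predicted similarity matrix is uniformly 1 and no longer reflects the disagreements present in the ground truth.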