Central to human-aligned AI is understanding the benefits of human-elicited labels over synthetic alternatives. While human soft-labels improve calibration by capturing uncertainty, prior studies conflate these benefits with the implicit correction of mislabeled data (mode shifts), obscuring the true effects of soft-labels. We present a controlled audit of soft-label learning across MNIST and a synthetic variant, re-annotating subsets to extract human uncertainty. By decoupling soft-label supervision from underlying label mode shifts, we show that while human soft-labels do provide accuracy gains, their larger value lies in acting as a regularizer that improves model calibration on difficult samples and promotes stable convergence across training runs. Dataset cartography reveals that models trained on human soft-labels mirror human uncertainty, whereas those trained on synthetic labels fail to align with humans. Broadly, this work provides a diagnostic testbed for human-AI uncertainty alignment.