Humans and modern vision models can reach similar classification accuracy while making systematically different kinds of mistakes: they differ not in how often they err, but in who gets mistaken for whom, and in which direction. We show that these directional confusions reveal distinct inductive biases that are invisible to accuracy alone. Using matched human and deep vision model responses on a natural-image categorization task under 12 perturbation types, we quantify asymmetry in confusion matrices and link it to generalization geometry through a Rate-Distortion (RD) framework, summarized by three signatures: slope (β), curvature (κ), and efficiency (AUC). We find that humans exhibit broad but weak asymmetries, whereas deep vision models show sparser, stronger directional collapses. Robustness training reduces global asymmetry but fails to recover the human-like breadth-strength profile of graded similarity. Mechanistic simulations further show that different asymmetry organizations shift the RD frontier in opposite directions, even when matched for performance. Together, these results position directional confusions and RD geometry as compact, interpretable signatures of inductive bias under distribution shift.
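To make the idea of directional confusion concrete, the sketch below shows one way asymmetry in a confusion matrix could be summarized. It is an illustration only: the abstract does not specify the metric used, so the function name `asymmetry_profile`, the `threshold` parameter, and the definitions of global asymmetry, breadth, and strength are all assumptions introduced here.

```python
import numpy as np

def asymmetry_profile(confusion, threshold=0.05):
    """Hypothetical summary of directional confusions (not the paper's metric).

    confusion[i, j] counts trials where true class i received response j.
    Returns (global_index, breadth, strength).
    """
    C = np.asarray(confusion, dtype=float)

    # Row-normalize so P[i, j] estimates P(response j | true class i).
    row_sums = C.sum(axis=1, keepdims=True)
    P = np.divide(C, row_sums, out=np.zeros_like(C), where=row_sums > 0)
    np.fill_diagonal(P, 0.0)  # keep errors only

    # A[i, j] > 0 means class i is mistaken for j more often than j for i.
    A = P - P.T

    # Each unordered class pair is counted once (upper triangle).
    iu = np.triu_indices_from(A, k=1)
    gaps = np.abs(A[iu])
    total_error = (P + P.T)[iu].sum()

    # Assumed global index: share of pairwise error mass that is one-directional.
    global_index = gaps.sum() / total_error if total_error > 0 else 0.0

    # Assumed breadth: fraction of class pairs with a gap above the threshold.
    directional = gaps > threshold
    breadth = directional.mean()

    # Assumed strength: mean gap among those directional pairs.
    strength = gaps[directional].mean() if directional.any() else 0.0
    return global_index, breadth, strength
```

Under these assumed definitions, a profile with high breadth and low strength would correspond to the "broad but weak" pattern attributed to humans, and low breadth with high strength to the "sparser, stronger" pattern attributed to models.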