Learning to Defer (L2D) improves AI reliability in decision-critical environments by training AI to either make its own prediction or defer the decision to a human expert. A key challenge is adapting to unseen experts at test time, whose competence can differ from the training population. Current methods for this task, however, can falter when unseen experts are out-of-distribution (OOD) relative to the training population. We identify a core architectural flaw as the cause: they learn identity-conditioned policies by processing class-indexed signals in fixed coordinates, creating shortcuts that violate the problem's inherent permutation symmetry. We introduce Identity-Free Deferral (IFD), an architecture that enforces this symmetry by construction. From a few-shot context, IFD builds a query-independent Bayesian competence profile for each expert. It then supplies the deferral rejector with a low-dimensional, role-indexed state containing only structural information, such as the model's confidence in its top-ranked class and the expert's estimated skill for that same role, which obscures absolute class identities. We train IFD using an uncertainty-aware, context-only objective that removes the need for expensive query-time expert labels. We formally prove the permutation invariance of our approach, contrasting it with the generic non-invariance of standard population encoders. Experiments on medical imaging benchmarks and ImageNet-16H with real human annotators show that IFD consistently improves generalisation to unseen experts, with gains in OOD settings, all while using fewer annotations than alternative methods.
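The mechanism the abstract describes can be sketched in a few lines. The sketch below is illustrative, not the paper's implementation: `competence_profile` stands in for the query-independent Bayesian profile (here, a Beta-Bernoulli posterior mean of per-class expert accuracy estimated from the few-shot context), `role_indexed_state` re-indexes signals by the model's confidence ranking so absolute class identities are hidden, and `defer` replaces the learned rejector with a toy threshold rule. All function names and the prior parameters are hypothetical.

```python
import numpy as np

def competence_profile(ctx_preds, ctx_labels, n_classes, a=1.0, b=1.0):
    """Per-class Beta(a, b)-Bernoulli posterior mean of the expert's
    accuracy, built only from the few-shot context (hypothetical)."""
    skill = np.empty(n_classes)
    for c in range(n_classes):
        mask = ctx_labels == c
        correct = np.sum(ctx_preds[mask] == c)
        skill[c] = (correct + a) / (mask.sum() + a + b)  # posterior mean
    return skill

def role_indexed_state(model_probs, skill, n_roles=3):
    """Role r = the model's r-th ranked class, so the state carries only
    structural information (confidence and matching expert skill), not
    absolute class identities."""
    ranking = np.argsort(model_probs)[::-1][:n_roles]
    return np.stack([model_probs[ranking], skill[ranking]], axis=1)

def defer(state):
    """Toy stand-in for the learned rejector: defer when the expert's
    estimated skill for the top role beats the model's confidence."""
    top_conf, top_skill = state[0]
    return top_skill > top_conf
```

Because the state is indexed by rank rather than by class, relabeling the classes consistently everywhere (context labels, expert predictions, model probabilities) leaves the state, and hence the deferral decision, unchanged; this is the permutation invariance the abstract claims to enforce by construction.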