The standard taxonomy of predictive uncertainty defines epistemic uncertainty as the part removable by collecting more data, while the standard measure identifies it with a mutual-information term. We prove that the definition and the measure are extensionally inconsistent: on an explicit construction, the measure assigns all uncertainty to the epistemic class, yet no quantity of training data reduces it. Reducibility is instead a property of the pair (uncertainty, acquisition class), and the dichotomy resolves into three parts: aleatoric, sample-reducible epistemic, and mechanism-reducible epistemic uncertainty. An exact identity for the value of an observation shows that in-distribution data never reduces mechanism-irreducible uncertainty and generically increases it. Ensemble disagreement, the epistemic estimate used in deployment, tracks the training procedure rather than the epistemic term. Under consistent training it collapses to zero even when the true epistemic term is positive, and under interpolation it equals initialization noise scaled by hyperparameters. A finite-sample falsification test and seed-swept experiments confirm the theory.
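For concreteness, the mutual-information measure referred to above is commonly estimated from an ensemble as the entropy of the averaged prediction minus the average entropy of the member predictions. The following is a minimal sketch of that estimate for a single input, assuming categorical predictions; the function names `entropy` and `decompose` and the array layout are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def entropy(p, axis=-1):
    # Shannon entropy in nats; clip to avoid log(0).
    p = np.clip(p, 1e-12, 1.0)
    return -np.sum(p * np.log(p), axis=axis)

def decompose(member_probs):
    """Split predictive entropy for one input into two terms.

    member_probs: array of shape (M, K), one categorical
    distribution over K classes per ensemble member (hypothetical layout).
    Returns (total, aleatoric, epistemic), where epistemic is the
    mutual-information estimate total - aleatoric.
    """
    member_probs = np.asarray(member_probs, dtype=float)
    total = entropy(member_probs.mean(axis=0))   # entropy of the mean prediction
    aleatoric = entropy(member_probs).mean()     # mean entropy of the members
    return total, aleatoric, total - aleatoric

# Members that agree give a (near-)zero epistemic term;
# confident members that disagree put all uncertainty in it.
agree = decompose([[0.5, 0.5], [0.5, 0.5]])
disagree = decompose([[1.0, 0.0], [0.0, 1.0]])
```

In this sketch the epistemic term is exactly the disagreement between members, which is why it depends on how the members were trained and not only on the data.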