Uncertainty estimation methods based on deep learning struggle to separate how the uncertainty in the state of the world manifests to us via measurement (objective end) from the way it gets entangled with the model specification and training procedure used to predict that state (subjective means), e.g., number of neurons, depth, connections, priors (if the model is Bayesian), weight initialization, etc. This raises the question of how far one can eliminate the degrees of freedom associated with these specifications and still be able to capture the objective end. Here, a novel non-parametric quantile estimation method for continuous random variables is introduced, based on the simplest neural network architecture with one degree of freedom: a single neuron. Its advantage is first shown in synthetic experiments that compare it with quantile estimation from ranking the order statistics (specifically for small sample sizes) and with quantile regression. In real-world applications, the method can be used to quantify predictive uncertainty under the split conformal prediction setting, whereby prediction intervals are estimated from the residuals of a pre-trained model on a held-out validation set and then used to quantify the uncertainty in future predictions. The single neuron is used here as a structureless ``thermometer'' that measures how uncertain the pre-trained model is. Benchmark regression and classification experiments demonstrate that the method is competitive in quality and coverage with state-of-the-art solutions, with the added benefit of being more computationally efficient.
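To make the split conformal setting concrete, the following is a minimal sketch, not the method introduced here. The abstract does not specify how the single neuron is trained, so `fit_scalar_quantile` is a hypothetical stand-in that fits one scalar parameter by subgradient descent on the pinball (quantile) loss; the step size, iteration count, and all function names are assumptions made for illustration. `split_conformal_halfwidth` is the standard order-statistic baseline for absolute residuals.

```python
import numpy as np


def pinball_loss(q, x, tau):
    """Mean pinball (quantile) loss of a scalar estimate q at level tau."""
    diff = np.asarray(x, dtype=float) - q
    return float(np.mean(np.maximum(tau * diff, (tau - 1.0) * diff)))


def fit_scalar_quantile(x, tau, steps=5000, lr=0.5):
    """Hypothetical one-parameter quantile estimator (an assumption, not the
    paper's single neuron): subgradient descent on the mean pinball loss."""
    x = np.asarray(x, dtype=float)
    scale = float(np.std(x)) or 1.0  # make the step size roughly scale-free
    q = float(np.median(x))          # start from the sample median
    for t in range(steps):
        # A subgradient of the mean pinball loss w.r.t. q is P(x <= q) - tau.
        grad = float(np.mean(x <= q)) - tau
        q -= lr * scale / np.sqrt(t + 1.0) * grad
    return q


def split_conformal_halfwidth(residuals, alpha):
    """Standard split conformal half-width: the ceil((n + 1)(1 - alpha))-th
    smallest absolute residual on the held-out set (infinite if n is too small)."""
    r = np.sort(np.abs(np.asarray(residuals, dtype=float)))
    n = r.size
    k = int(np.ceil((n + 1) * (1.0 - alpha)))
    return float("inf") if k > n else float(r[k - 1])


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic held-out residuals of some pre-trained regressor (assumed data).
    residuals = rng.normal(size=2000)
    alpha = 0.1
    w_rank = split_conformal_halfwidth(residuals, alpha)
    w_fit = fit_scalar_quantile(np.abs(residuals), 1.0 - alpha)
    # A future prediction y_hat would get the interval [y_hat - w, y_hat + w].
    print(f"order-statistic half-width: {w_rank:.3f}")
    print(f"pinball-loss half-width:    {w_fit:.3f}")
```

Either half-width `w` turns a point prediction `y_hat` into the interval `[y_hat - w, y_hat + w]`. The order-statistic version carries the usual finite-sample coverage guarantee under exchangeability, whereas the fitted scalar in this sketch only approximates the empirical quantile.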