Among Bayesian methods, Monte Carlo dropout provides principled tools for evaluating the epistemic uncertainty of neural networks. Its popularity has recently led to seminal works proposing to activate the dropout layers only at inference time to evaluate uncertainty. This approach, which we call dropout injection, has a clear benefit over its traditional counterpart (which we call embedded dropout): it yields a post hoc uncertainty measure for any existing network previously trained without dropout, avoiding an additional, time-consuming training process. Unfortunately, no previous work has compared injected and embedded dropout; we therefore provide the first thorough investigation, focusing on regression problems. The main contribution of our work is a set of guidelines on the effective use of injected dropout, so that it can be a practical alternative to the current use of embedded dropout. In particular, we show that its effectiveness strongly relies on a suitable scaling of the corresponding uncertainty measure, and we discuss the trade-off between negative log-likelihood and calibration error as a function of the scale factor. Experimental results on UCI data sets and crowd counting benchmarks support our claim that dropout injection can behave as a competitive post hoc uncertainty quantification technique.
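The idea of dropout injection followed by rescaling can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the toy two-layer ReLU network, the function names `mc_dropout_predict`, `gaussian_nll`, and `nll_optimal_scale`, the Gaussian form of the predictive distribution, and the choice of the scale factor as the closed-form minimizer of the Gaussian negative log-likelihood on held-out data.

```python
import numpy as np

def mc_dropout_predict(x, w1, w2, drop_rate, n_samples, rng):
    """Inject dropout at inference into a (hypothetical) two-layer ReLU network
    and return the per-point mean and standard deviation over stochastic passes."""
    preds = []
    for _ in range(n_samples):
        h = np.maximum(x @ w1, 0.0)                # hidden ReLU activations
        mask = rng.random(h.shape) >= drop_rate    # keep each unit with prob. 1 - drop_rate
        h = h * mask / (1.0 - drop_rate)           # inverted-dropout rescaling
        preds.append(h @ w2)
    preds = np.stack(preds)                        # shape: (n_samples, n_points, 1)
    mu = preds.mean(axis=0).ravel()
    sigma = preds.std(axis=0).ravel() + 1e-6       # small floor avoids division by zero
    return mu, sigma

def gaussian_nll(y, mu, sigma):
    """Average negative log-likelihood of y under N(mu, sigma^2)."""
    return float(np.mean(0.5 * np.log(2.0 * np.pi * sigma**2)
                         + (y - mu) ** 2 / (2.0 * sigma**2)))

def nll_optimal_scale(y, mu, sigma):
    """Scale factor C minimizing gaussian_nll(y, mu, C * sigma).
    Setting the derivative with respect to C to zero gives
    C^2 = mean(((y - mu) / sigma)^2)."""
    return float(np.sqrt(np.mean(((y - mu) / sigma) ** 2)))

# Toy usage: random weights stand in for a network trained without dropout.
rng = np.random.default_rng(0)
w1 = rng.normal(size=(3, 16))
w2 = rng.normal(size=(16, 1)) / 4.0
x_val = rng.normal(size=(200, 3))
y_val = (np.maximum(x_val @ w1, 0.0) @ w2).ravel() + rng.normal(scale=0.5, size=200)

mu, sigma = mc_dropout_predict(x_val, w1, w2, drop_rate=0.1, n_samples=50, rng=rng)
scale = nll_optimal_scale(y_val, mu, sigma)
print("NLL before scaling:", gaussian_nll(y_val, mu, sigma))
print("NLL after scaling: ", gaussian_nll(y_val, mu, scale * sigma))
```

In this sketch the scale factor is fitted on a validation set and then reused at test time. A scale chosen to minimize negative log-likelihood is not guaranteed to minimize calibration error as well, which is one way to read the trade-off the abstract mentions.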