Deep Ensembles are a simple, reliable, and effective method for improving both the predictive performance and the uncertainty estimates of deep learning approaches. However, they are widely criticised as computationally expensive, because multiple independent models must be deployed. Recent work has challenged this view, showing that for predictive accuracy, ensembles can be more computationally efficient (at inference) than scaling single models within an architecture family. This is achieved by cascading ensemble members via an early-exit approach. In this work, we investigate extending these efficiency gains to tasks related to uncertainty estimation. Because many such tasks, e.g. selective classification, are binary classification problems, our key novel insight is to pass only those samples that fall within a window close to the binary decision boundary to later cascade stages. Experiments on ImageNet-scale data across a number of network architectures and uncertainty tasks show that the proposed window-based early-exit approach achieves a superior uncertainty-computation trade-off compared to scaling single models. For example, a cascaded EfficientNet-B2 ensemble achieves similar coverage at 5% risk to a single EfficientNet-B4 with fewer than 30% of the multiply-accumulate operations (MACs). We also find that cascades/ensembles give more reliable improvements on out-of-distribution (OOD) data than scaling models up. Code for this work is available at: https://github.com/Guoxoug/window-early-exit.
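The window-based early-exit idea can be illustrated with a minimal NumPy sketch. It is not the implementation from the linked repository: the function names (`window_cascade`, `uncertainty`), the choice of one minus the maximum softmax probability as the uncertainty score, and the symmetric window `|u - tau| < half_width` are all assumptions made for illustration. Each stage averages the softmax outputs of the members evaluated so far, and only samples whose score lies inside the window around the decision threshold `tau` are passed to the next member.

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the class axis.
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def uncertainty(probs):
    # Assumed score: 1 - max softmax probability (higher = less certain).
    return 1.0 - probs.max(axis=-1)

def window_cascade(x, members, tau, half_width):
    """Hypothetical window-based early-exit cascade.

    members: list of callables mapping a batch of inputs to logits.
    tau: threshold of the downstream binary decision (e.g. accept/reject).
    half_width: samples with |score - tau| < half_width continue onward.
    """
    n = x.shape[0]
    active = np.arange(n)          # indices of samples still in the cascade
    running = None                 # summed member probabilities for active samples
    probs = None
    scores = np.empty(n)
    exit_stage = np.empty(n, dtype=int)
    for k, member in enumerate(members, start=1):
        p = softmax(member(x[active]))
        running = p if running is None else running + p
        mean_p = running / k       # ensemble of the first k members
        if probs is None:
            probs = np.empty((n, p.shape[1]))
        u = uncertainty(mean_p)
        probs[active] = mean_p
        scores[active] = u
        exit_stage[active] = k
        inside = np.abs(u - tau) < half_width
        if k == len(members) or not inside.any():
            break                  # last stage reached, or everyone exited early
        active = active[inside]    # only near-boundary samples continue
        running = running[inside]
    return probs, scores, exit_stage

# Toy usage with two random linear "models" standing in for real networks.
rng = np.random.default_rng(0)
W1 = rng.normal(size=(8, 3))
W2 = rng.normal(size=(8, 3))
members = [lambda a: a @ W1, lambda a: a @ W2]
x = rng.normal(size=(16, 8))
probs, scores, exit_stage = window_cascade(x, members, tau=0.3, half_width=0.1)
print("fraction passed to stage 2:", np.mean(exit_stage == 2))
```

In this sketch the extra inference cost scales with the fraction of samples that fall inside the window, so a narrow window keeps the average cost close to that of a single member while still refining the decisions nearest the threshold.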