Training a diverse ensemble of models has several practical applications, such as providing candidates for model selection with better out-of-distribution (OOD) generalization and enabling the detection of OOD samples via Bayesian principles. An existing approach to diverse ensemble training encourages the models to disagree on provided OOD samples. However, this approach is computationally expensive and requires well-separated ID and OOD examples, so it has only been demonstrated in small-scale settings. $\textbf{Method.}$ This work presents a method for Scalable Ensemble Diversification (SED) applicable to large-scale settings (e.g., ImageNet) that does not require OOD samples. Instead, SED identifies hard training samples on the fly and encourages the ensemble members to disagree on them. To improve scaling, we show how to avoid the expensive computation of exhaustive pairwise disagreements across models that existing methods rely on. $\textbf{Results.}$ We evaluate the benefits of diversification with experiments on ImageNet. First, for OOD generalization, we observe large benefits from diversification in multiple settings, including output-space (classical) ensembles and weight-space ensembles (model soups). Second, for OOD detection, we turn the diversity of ensemble hypotheses into a novel uncertainty-score estimator that surpasses a large number of OOD detection baselines. Code is available here: https://github.com/AlexanderRubinstein/diverse-universe-public.
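To make the scaling idea concrete, here is a minimal NumPy sketch of a diversification objective in the spirit described above: hard samples are flagged on the fly (here, hypothetically, as the least-confident samples under the ensemble-mean prediction), and each member's agreement is measured against the ensemble mean rather than against every other member, replacing O(M²) pairwise terms with O(M) ones. The function names, the confidence-based hardness criterion, and the agreement measure are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def softmax(logits, axis=-1):
    # Numerically stable softmax.
    z = logits - logits.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def flag_hard_samples(member_logits, frac=0.5):
    """Hypothetical on-the-fly hardness criterion: flag the `frac` samples
    on which the ensemble-mean prediction is least confident.

    member_logits: (M, N, C) logits from M members on N samples, C classes.
    Returns a boolean mask of shape (N,).
    """
    mean_probs = softmax(member_logits).mean(axis=0)   # (N, C)
    confidence = mean_probs.max(axis=-1)               # (N,)
    k = max(1, int(frac * confidence.shape[0]))
    hard = np.zeros(confidence.shape[0], dtype=bool)
    hard[np.argsort(confidence)[:k]] = True
    return hard

def diversification_loss(member_logits, hard_mask):
    """Encourage members to disagree on hard samples.

    Each member is compared against the ensemble-mean prediction
    (an O(M) proxy) instead of summing O(M^2) pairwise disagreements.
    Minimizing the mean agreement on hard samples pushes members apart.
    """
    probs = softmax(member_logits)                     # (M, N, C)
    mean_probs = probs.mean(axis=0, keepdims=True)     # (1, N, C)
    agreement = (probs * mean_probs).sum(axis=-1)      # (M, N)
    return agreement[:, hard_mask].mean()
```

On identical confident predictions the loss is near 1 (high agreement); when members place their mass on different classes, the per-member agreement with the flat ensemble mean drops, so gradient descent on this loss drives disagreement on the flagged samples.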
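For the OOD-detection use case, a standard way to turn ensemble diversity into an uncertainty score is the mutual-information (BALD-style) decomposition: the entropy of the mean prediction minus the mean of the per-member entropies. The sketch below shows this generic score as one plausible instantiation; it is not claimed to be SED's exact estimator.

```python
import numpy as np

def softmax(logits, axis=-1):
    # Numerically stable softmax.
    z = logits - logits.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def disagreement_score(member_logits, eps=1e-12):
    """Generic diversity-based uncertainty score (mutual information):
    H[mean prediction] - mean of per-member entropies.

    High values mean members are individually confident yet disagree,
    a signature of OOD inputs for a diversified ensemble.

    member_logits: (M, N, C) logits; returns one score per sample, shape (N,).
    """
    probs = softmax(member_logits)                                  # (M, N, C)
    mean_probs = probs.mean(axis=0)                                 # (N, C)
    total_entropy = -(mean_probs * np.log(mean_probs + eps)).sum(axis=-1)
    member_entropy = -(probs * np.log(probs + eps)).sum(axis=-1)    # (M, N)
    return total_entropy - member_entropy.mean(axis=0)
```

When all members agree, the score is near zero regardless of confidence; it grows only with genuine inter-member disagreement, which is why a more diverse ensemble can give a sharper OOD signal than confidence alone.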