Recent research underscores the pivotal role of the scale of the Out-of-Distribution (OOD) feature representation field in determining a model's efficacy at OOD detection. Consequently, model ensembling has emerged as a prominent strategy for expanding this feature representation field, capitalizing on anticipated diversity among ensemble members. However, our novel qualitative and quantitative ensemble evaluation methods, Loss Basin/Barrier Visualization and the Self-Coupling Index, reveal a critical drawback in existing ensemble methods: their members' weights are related by affine transformations, exhibiting limited variability and thus failing to achieve the desired diversity in feature representation. To address this limitation, we extend the ensemble dimension beyond conventional factors such as different weight initializations and data holdouts to distinct supervision tasks. This approach, termed Multi-Comprehension (MC) Ensemble, leverages diverse training tasks to generate distinct comprehensions of the data and labels, thereby enlarging the feature representation field. Our experiments demonstrate the superior OOD detection performance of the MC Ensemble strategy compared to both the naive Deep Ensemble method and a standalone model of comparable size, underscoring the effectiveness of our approach in enhancing a model's capability to detect instances outside its training distribution.
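To make the ensemble-based detection setting concrete, the sketch below aggregates a per-member confidence score into a single OOD score. It uses the standard maximum-softmax-probability (MSP) baseline as the per-member score purely for illustration; the scoring rule, model definitions, and threshold here are assumptions, not this paper's actual method.

```python
import numpy as np

def softmax(logits: np.ndarray) -> np.ndarray:
    """Numerically stable softmax over the last axis."""
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def msp_score(logits: np.ndarray) -> np.ndarray:
    """Maximum softmax probability per input: higher = more in-distribution."""
    return softmax(logits).max(axis=-1)

def ensemble_ood_score(member_logits: list[np.ndarray]) -> np.ndarray:
    """Average the MSP score across ensemble members.

    Each element of member_logits is an (n_inputs, n_classes) array from one
    ensemble member. A low averaged score flags a likely OOD input. Members
    trained under different supervision tasks would ideally disagree more on
    OOD inputs, driving this score down.
    """
    return np.mean([msp_score(l) for l in member_logits], axis=0)

# Toy logits from two hypothetical members: input 0 has peaked (confident)
# logits, input 1 has flat (uncertain) logits.
member_a = np.array([[8.0, 0.0, 0.0], [0.3, 0.2, 0.1]])
member_b = np.array([[7.0, 1.0, 0.0], [0.1, 0.4, 0.2]])
scores = ensemble_ood_score([member_a, member_b])
# The confident input receives a higher score than the uncertain one, so a
# threshold on the averaged score separates them.
```

In practice one would calibrate the rejection threshold on held-out in-distribution data; the point of the sketch is only the aggregation structure an ensemble detector shares, regardless of how member diversity is obtained.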