Recent studies on deep ensembles have identified the sharpness of the local minima of individual learners and the diversity of ensemble members as key factors in improving test-time performance. Building on this, our study investigates the interplay between sharpness and diversity within deep ensembles, illustrating their crucial role in robust generalization to both in-distribution (ID) and out-of-distribution (OOD) data. We discover a trade-off between sharpness and diversity: minimizing the sharpness of the loss landscape tends to diminish the diversity of individual members within the ensemble, which in turn limits the performance gains from ensembling. The trade-off is justified through our theoretical analysis and verified empirically through extensive experiments. To address the issue of reduced diversity, we introduce SharpBalance, a novel training approach that balances sharpness and diversity within ensembles. Theoretically, we show that our training strategy achieves a better sharpness-diversity trade-off. Empirically, we conduct comprehensive evaluations on multiple datasets (CIFAR-10, CIFAR-100, TinyImageNet) and show that SharpBalance not only effectively improves the sharpness-diversity trade-off, but also significantly improves ensemble performance in both ID and OOD scenarios.
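To make the two quantities in the abstract concrete, the following is a minimal sketch (not the paper's implementation) of how sharpness and diversity are commonly measured: diversity as average pairwise prediction disagreement among ensemble members, and sharpness as the average loss increase under small random weight perturbations. The toy linear models, the data, and the perturbation radius `rho` are all illustrative assumptions; the paper's own definitions may differ.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def cross_entropy(W, X, y):
    # Mean cross-entropy loss of a linear classifier with weights W.
    p = softmax(X @ W)
    return -np.log(p[np.arange(len(y)), y] + 1e-12).mean()

def pairwise_disagreement(members, X):
    """Diversity proxy: fraction of inputs on which two members predict
    different labels, averaged over all member pairs."""
    preds = [np.argmax(X @ W, axis=1) for W in members]
    m = len(preds)
    pairs = [(i, j) for i in range(m) for j in range(i + 1, m)]
    return float(np.mean([np.mean(preds[i] != preds[j]) for i, j in pairs]))

def sharpness(W, X, y, rho=0.05, n_samples=8):
    """Sharpness proxy: average loss increase under random weight
    perturbations of norm rho (a crude stand-in for worst-case,
    SAM-style sharpness)."""
    base = cross_entropy(W, X, y)
    deltas = []
    for _ in range(n_samples):
        eps = rng.standard_normal(W.shape)
        eps *= rho / np.linalg.norm(eps)
        deltas.append(cross_entropy(W + eps, X, y) - base)
    return float(np.mean(deltas))

# Toy data and a 3-member "ensemble" of random linear classifiers.
X = rng.standard_normal((200, 10))
y = rng.integers(0, 3, size=200)
members = [rng.standard_normal((10, 3)) for _ in range(3)]

div = pairwise_disagreement(members, X)
sh = float(np.mean([sharpness(W, X, y) for W in members]))
print(f"diversity (disagreement): {div:.3f}, mean sharpness: {sh:.4f}")
```

The trade-off the abstract describes would appear here as: training each member to drive its `sharpness` value down (a flatter minimum) tends to also drive `pairwise_disagreement` down, reducing the benefit of averaging the members.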