Ensembling neural networks is a widely recognized approach to improving model performance, estimating uncertainty, and increasing robustness in deep supervised learning. However, deep ensembles often incur high computational and memory costs. In addition, the efficiency of a deep ensemble depends on the diversity among its members, which is difficult to achieve for large, over-parameterized deep neural networks. Moreover, ensemble learning has not yet been widely adopted in self-supervised or unsupervised representation learning, where it remains challenging. Motivated by these challenges, we present a novel self-supervised training regime that leverages an ensemble of independent sub-networks, complemented by a new loss function designed to encourage diversity. Our method efficiently builds a highly diverse sub-model ensemble, yielding well-calibrated estimates of model uncertainty with minimal computational overhead compared to traditional deep self-supervised ensembles. To evaluate our approach, we conducted extensive experiments across various tasks, including in-distribution generalization, out-of-distribution detection, dataset corruption, and semi-supervised settings. The results demonstrate that our method significantly improves prediction reliability. Our approach achieves excellent accuracy and improves calibration, surpassing baseline performance across a wide range of self-supervised architectures in computer vision, natural language processing, and genomics data.
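As a rough illustration of how an ensemble of sub-networks can yield both a prediction and an uncertainty estimate, the sketch below averages the softmax outputs of several members and measures how much they disagree. It is not the training regime or the diversity loss proposed here; the function name `ensemble_summary`, the array shapes, and the use of mutual information as a disagreement score are all assumptions made for illustration.

```python
import numpy as np

def softmax(logits, axis=-1):
    # Numerically stable softmax.
    z = logits - logits.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def ensemble_summary(member_logits):
    """Summarize an ensemble (hypothetical helper, not the paper's method).

    member_logits: array of shape (M, N, C) holding logits from M
    sub-networks for N inputs and C classes.
    Returns the averaged probabilities (N, C), the predictive entropy (N,),
    and a disagreement score (N,).
    """
    eps = 1e-12
    probs = softmax(member_logits, axis=-1)        # (M, N, C)
    mean_probs = probs.mean(axis=0)                # (N, C)
    # Entropy of the averaged prediction: total predictive uncertainty.
    total_entropy = -(mean_probs * np.log(mean_probs + eps)).sum(axis=-1)
    # Average entropy of the individual members.
    member_entropy = -(probs * np.log(probs + eps)).sum(axis=-1).mean(axis=0)
    # Their difference (mutual information) is zero when all members agree
    # and grows as the members' predictions diverge.
    disagreement = total_entropy - member_entropy
    return mean_probs, total_entropy, disagreement

# Random logits stand in for the outputs of 4 sub-networks on 8 inputs.
rng = np.random.default_rng(0)
logits = rng.normal(size=(4, 8, 10))
mean_probs, entropy, disagreement = ensemble_summary(logits)
```

A disagreement score near zero for every input would indicate that the members have collapsed to the same function, which is the situation a diversity-encouraging loss is meant to avoid.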