Neural network ensembles have been used effectively to improve generalization by combining the predictions of multiple independently trained models. However, the growing scale and complexity of deep neural networks have made these methods prohibitively expensive and time-consuming to implement. Low-cost ensemble methods have therefore become increasingly important, as they alleviate the need to train multiple models from scratch while retaining the generalization benefits of traditional ensemble learning. This dissertation introduces and formalizes a low-cost framework for constructing Subnetwork Ensembles, in which a collection of child networks is formed by sampling, perturbing, and optimizing subnetworks from a trained parent model. We explore several distinct methodologies for generating child networks and evaluate their efficacy through a variety of ablation studies and established benchmarks. Our findings reveal that this approach can greatly improve training efficiency, parameter utilization, and generalization performance while minimizing computational cost. Subnetwork Ensembles offer a compelling framework for exploring how we can build better systems by leveraging the unrealized potential of deep neural networks.
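To make the sample, perturb, and optimize loop concrete, the following is a minimal sketch and not the dissertation's actual method. It assumes a toy linear regression model in place of a deep network, random binary masks as the sampling strategy, zeroing of masked weights as the perturbation, and a few gradient steps as the optimization phase. All function names and hyperparameters (`train_linear`, `make_children`, `keep_prob`, and so on) are hypothetical and chosen only for illustration.

```python
import numpy as np


def train_linear(X, y, w, mask, steps, lr):
    """Gradient descent on mean squared error; masked-out weights stay at zero."""
    w = w * mask
    for _ in range(steps):
        grad = 2.0 * X.T @ (X @ w - y) / len(y)
        w = w - lr * grad * mask  # only weights kept by the mask are updated
    return w


def make_children(X, y, parent_w, n_children, keep_prob, steps, lr, rng):
    """Build child models from a trained parent (illustrative assumption only)."""
    children = []
    for _ in range(n_children):
        # Sample: choose a random subnetwork by keeping each weight with prob. keep_prob.
        mask = (rng.random(parent_w.shape) < keep_prob).astype(parent_w.dtype)
        # Perturb and optimize: zero the dropped weights, then briefly fine-tune the rest.
        child_w = train_linear(X, y, parent_w, mask, steps, lr)
        children.append((mask, child_w))
    return children


def ensemble_predict(X, children):
    """Average the predictions of all child models."""
    return np.mean([X @ w for _, w in children], axis=0)


# Synthetic regression data (assumed purely for the demonstration).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
true_w = rng.normal(size=10)
y = X @ true_w + 0.1 * rng.normal(size=200)

# Train the parent once from scratch with a full (all-ones) mask.
parent_w = train_linear(X, y, np.zeros(10), np.ones(10), steps=200, lr=0.05)

# Each child needs only a short fine-tuning budget compared with the parent.
children = make_children(
    X, y, parent_w, n_children=4, keep_prob=0.5, steps=20, lr=0.05, rng=rng
)
pred = ensemble_predict(X, children)
```

In this sketch the parent is trained once for 200 steps, while each child is fine-tuned for only 20, which is where the cost saving over training every ensemble member from scratch would come from under these assumptions.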