The Multi-Prize Lottery Ticket Hypothesis posits that randomly initialized neural networks contain several subnetworks that achieve accuracy comparable to fully trained models of the same architecture. However, current methods require that the network be sufficiently overparameterized. In this work, we propose a modification to two state-of-the-art algorithms (Edge-Popup and Biprop) that finds high-accuracy subnetworks with no additional storage cost or scaling. The algorithm, Iterative Weight Recycling, identifies subsets of important weights within a randomly initialized network for intra-layer reuse. Empirically, we show improvements on smaller network architectures and at higher prune rates, finding that model sparsity can be increased through the "recycling" of existing weights. In addition to Iterative Weight Recycling, we complement the Multi-Prize Lottery Ticket Hypothesis with a reciprocal finding: high-accuracy, randomly initialized subnetworks produce diverse masks, despite being generated with the same hyperparameters and pruning strategy. We explore the landscapes of these masks, which show high variability.
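To make the idea of intra-layer reuse concrete, the following is a minimal sketch of a single recycling step, not the implementation evaluated in this work. It assumes each weight in a layer carries a learned importance score (as in Edge-Popup-style methods) and that the values of the `k` highest-scored weights are copied onto the `k` lowest-scored positions; the abstract does not specify the selection rule, so this rule, the parameter `k`, and the function names `recycle_weights` and `mask_jaccard` are all hypothetical. The second helper measures the overlap between two binary masks, one simple way to quantify the mask diversity described above.

```python
def recycle_weights(weights, scores, k):
    """Return a copy of `weights` in which the k lowest-scored positions
    hold the values of the k highest-scored positions (same layer).

    Assumption: `scores` ranks weight importance; higher means more important.
    """
    if len(weights) != len(scores):
        raise ValueError("weights and scores must have the same length")
    if k < 0 or 2 * k > len(weights):
        raise ValueError("k must satisfy 0 <= 2k <= len(weights)")

    # Indices sorted from least to most important.
    order = sorted(range(len(weights)), key=lambda i: scores[i])
    low = order[:k]                    # positions to overwrite
    high = order[len(order) - k:]      # positions whose values are reused

    recycled = list(weights)
    # Pair the least important position with the most important value.
    for dst, src in zip(low, reversed(high)):
        recycled[dst] = weights[src]
    return recycled


def mask_jaccard(mask_a, mask_b):
    """Jaccard similarity of two equal-length binary masks (1 = weight kept)."""
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must have the same length")
    both = sum(1 for a, b in zip(mask_a, mask_b) if a and b)
    either = sum(1 for a, b in zip(mask_a, mask_b) if a or b)
    return 1.0 if either == 0 else both / either
```

Because the step only copies values already present in the layer, the set of distinct weight values never grows, which is consistent with the claim of no additional storage cost. A low `mask_jaccard` value between two subnetworks found under identical settings would indicate that the masks share few retained weights.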