The Strong Lottery Ticket Hypothesis (SLTH) states that large, randomly initialized neural networks contain sparse subnetworks capable of approximating a target function at initialization without training, suggesting that pruning alone is sufficient. Pruning methods are typically classified as either unstructured, where individual weights can be removed from the network, or structured, where parameters are removed according to specific patterns, as in neuron pruning. Existing theoretical results supporting the SLTH rely almost exclusively on unstructured pruning, showing that logarithmic overparameterization suffices to approximate simple target networks. In contrast, neuron pruning has received limited theoretical attention, despite its practical appeal for direct hardware speedups. In this work, we consider the problem of approximating a single bias-free ReLU neuron by pruning hidden units of a randomly initialized two-layer ReLU network, effectively isolating the intrinsic limitations of neuron pruning. We show that achieving an $\varepsilon$-approximation requires a starting network size of $\Omega(1/\varepsilon)$ for neuron pruning, whereas weight pruning succeeds with only $O(\log(1/\varepsilon))$ hidden units, revealing an exponential separation between the two approaches.
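To make the distinction between the two pruning regimes concrete, the following is a minimal NumPy sketch of a bias-free two-layer ReLU network $f(x) = \sum_i a_i \,\mathrm{ReLU}(\langle w_i, x\rangle)$ under a neuron mask and under a weight mask. It is an illustration only, not the construction analyzed in this work: the Gaussian initialization, the dimensions, and the helper names `forward` and `neuron_prune` are all assumptions made for the example.

```python
import numpy as np


def relu(z):
    return np.maximum(z, 0.0)


def forward(x, W, a, weight_mask=None, out_mask=None):
    """Bias-free two-layer ReLU network with optional weight pruning.

    x: (batch, d) inputs, W: (n, d) hidden weights, a: (n,) output weights.
    weight_mask: (n, d) 0/1 mask on individual hidden weights.
    out_mask: (n,) 0/1 mask on individual output weights.
    """
    if weight_mask is not None:
        W = W * weight_mask
    if out_mask is not None:
        a = a * out_mask
    return relu(x @ W.T) @ a


def neuron_prune(x, W, a, keep):
    """Neuron pruning: drop whole hidden units; keep is a boolean (n,) array."""
    return relu(x @ W[keep].T) @ a[keep]


# Assumed setup for illustration: standard Gaussian initialization.
rng = np.random.default_rng(0)
d, n = 3, 8
W = rng.standard_normal((n, d))
a = rng.standard_normal(n)
x = rng.standard_normal((5, d))

# Neuron pruning chooses one bit per hidden unit (n bits in total).
keep = rng.random(n) < 0.5
y_neuron = neuron_prune(x, W, a, keep)

# The same subnetwork expressed as a weight mask: zero the output weight
# of every removed unit and leave the hidden weights untouched.
y_as_weight_mask = forward(x, W, a, out_mask=keep.astype(float))

# Weight pruning chooses one bit per parameter (n * (d + 1) bits in total),
# so it can also reshape the direction of each surviving hidden unit.
weight_mask = (rng.random((n, d)) < 0.5).astype(float)
y_weight = forward(x, W, a, weight_mask=weight_mask, out_mask=keep.astype(float))

print("neuron-mask bits:", n, "| weight-mask bits:", n * (d + 1))
print("neuron pruning is a special case of weight pruning:",
      np.allclose(y_neuron, y_as_weight_mask))
```

Every neuron mask corresponds to a weight mask, but not the other way round. The sketch shows only this containment between the two search spaces; it does not reproduce the $\Omega(1/\varepsilon)$ versus $O(\log(1/\varepsilon))$ separation itself.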