Since its use in the Lottery Ticket Hypothesis, iterative magnitude pruning (IMP) has become a popular method for extracting sparse subnetworks that can be trained to high performance. Despite its popularity, why IMP succeeds so broadly remains unclear. One possibility is that IMP is especially capable of extracting and maintaining strong inductive biases. Supporting this, recent work has shown that applying IMP to fully connected neural networks (FCNs) leads to the emergence of local receptive fields (RFs), an architectural feature present in mammalian visual cortex and convolutional neural networks. How IMP achieves this remains an open question. Inspired by results showing that training FCNs on synthetic images with highly non-Gaussian statistics (e.g., sharp edges) is sufficient to drive the formation of local RFs, we hypothesize that IMP iteratively increases the non-Gaussian statistics present in FCN representations, creating a feedback loop that enhances localization. We develop a new method for measuring the effect of individual weights on the statistics of FCN representations (the "cavity method"), which allows us to find evidence supporting this hypothesis. Our work, the first to study the effect of IMP on the statistics of neural-network representations, sheds parsimonious light on one way in which IMP can drive the formation of strong inductive biases.
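For readers unfamiliar with the procedure, one round of IMP as used in the Lottery Ticket literature can be sketched as follows. This is a minimal illustration, not the paper's implementation: `imp_step`, the flat weight lists, and the 20% default pruning fraction are hypothetical simplifications of the standard prune-then-rewind loop.

```python
# Minimal sketch of one iterative-magnitude-pruning (IMP) round:
# after training, the smallest-magnitude surviving weights are masked out,
# and the remaining weights are rewound to their initial values before retraining.
# All names and the flat-list weight representation are illustrative.

def imp_step(init_weights, trained_weights, mask, prune_frac=0.2):
    """Return (new_mask, rewound_weights) after one IMP round."""
    # Magnitudes of weights still alive under the current mask.
    alive = [(abs(w), i) for i, (w, m) in enumerate(zip(trained_weights, mask)) if m]
    alive.sort()  # ascending by magnitude
    n_prune = int(prune_frac * len(alive))
    new_mask = list(mask)
    for _, i in alive[:n_prune]:
        new_mask[i] = 0  # prune the smallest-magnitude weights
    # Lottery-ticket rewinding: survivors restart from their initial values.
    rewound = [w0 if m else 0.0 for w0, m in zip(init_weights, new_mask)]
    return new_mask, rewound
```

Iterating this step (train, prune, rewind, retrain) produces progressively sparser subnetworks; the abstract's hypothesis concerns how each such round reshapes the statistics of the surviving network's representations.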