Since its use in the Lottery Ticket Hypothesis, iterative magnitude pruning (IMP) has become a popular method for extracting sparse subnetworks that can be trained to high performance. Despite this, the underlying nature of IMP's general success remains unclear. One possibility is that IMP is especially capable of extracting and maintaining strong inductive biases. In support of this, recent work has shown that applying IMP to fully connected neural networks (FCNs) leads to the emergence of local receptive fields (RFs), an architectural feature present in mammalian visual cortex and convolutional neural networks. The question of how IMP is able to do this remains unanswered. Inspired by results showing that training FCNs on synthetic images with highly non-Gaussian statistics (e.g., sharp edges) is sufficient to drive the formation of local RFs, we hypothesize that IMP iteratively maximizes the non-Gaussian statistics present in the representations of FCNs, creating a feedback loop that enhances localization. We develop a new method for measuring the effect of individual weights on the statistics of the FCN representations ("cavity method"), which allows us to find evidence in support of this hypothesis. Our work, which is the first to study the effect IMP has on the representations of neural networks, sheds parsimonious light on one way in which IMP can drive the formation of strong inductive biases.
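The abstract does not specify how the "cavity method" is computed; the following is a minimal illustrative sketch of the general idea, assuming a one-layer linear map for the FCN representation and excess kurtosis as the non-Gaussianity measure. The function names (`excess_kurtosis`, `cavity_effect`) and all implementation details are hypothetical, not the authors' actual procedure.

```python
import numpy as np

def excess_kurtosis(x):
    # Excess kurtosis of a 1-D sample; approximately 0 for Gaussian data.
    x = (x - x.mean()) / x.std()
    return float((x ** 4).mean() - 3.0)

def cavity_effect(W, X, i, j):
    """Change in mean non-Gaussianity of the hidden pre-activations
    when the single weight W[i, j] is removed ('cavitated')."""
    H = X @ W.T  # pre-activations, shape (n_samples, n_hidden)
    base = np.mean([excess_kurtosis(H[:, k]) for k in range(H.shape[1])])
    W_cav = W.copy()
    W_cav[i, j] = 0.0  # remove one weight, as magnitude pruning would
    H_cav = X @ W_cav.T
    cav = np.mean([excess_kurtosis(H_cav[:, k]) for k in range(H_cav.shape[1])])
    return cav - base  # positive: removing this weight raises non-Gaussianity

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 16))  # synthetic inputs
W = rng.normal(size=(8, 16))     # first-layer weights of a toy FCN
delta = cavity_effect(W, X, 0, 0)
```

Under the paper's hypothesis, IMP would tend to prune weights whose removal increases (or least decreases) such a non-Gaussianity score, iteratively amplifying the statistics that drive RF localization.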