Model stitching (Lenc & Vedaldi 2015) is a compelling methodology for comparing neural network representations because it measures the degree to which they can be interchanged. We expand on previous work by Bansal, Nakkiran & Barak, which used model stitching to compare same-shaped representations learned by networks of the same architecture that differ in random seed and/or training procedure. Our contribution makes it possible to compare representations learned by layers of different shapes in neural networks with different architectures. We then reveal unexpected behavior of model stitching: for small ResNets, convolution-based stitching can reach high accuracy when the sender layer sits later in the first (sender) network than the receiver layer does in the second (receiver) network, even if the two layers are far apart in depth.
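To make the idea of stitching layers with different shapes concrete, the following is a minimal sketch, not the authors' implementation. It assumes the simplest possible stitch: a 1x1 convolution that maps a sender activation with `C_in` channels to the `C_out` channels a receiver layer expects, leaving the spatial size unchanged. The names `conv1x1_stitch` and `stitched_forward` are hypothetical, and handling mismatched spatial resolutions (for example with strided convolutions or resizing) is left out.

```python
from typing import Callable, List

# An activation stored as nested lists with shape (channels, height, width).
Tensor3 = List[List[List[float]]]


def conv1x1_stitch(x: Tensor3, weight: List[List[float]], bias: List[float]) -> Tensor3:
    """Apply a 1x1 convolution: a per-pixel linear map from C_in to C_out channels.

    weight has shape (C_out, C_in) and bias has length C_out.
    """
    c_in = len(x)
    height, width = len(x[0]), len(x[0][0])
    if any(len(row) != c_in for row in weight):
        raise ValueError("weight must have shape (C_out, C_in)")
    if len(bias) != len(weight):
        raise ValueError("bias must have length C_out")
    return [
        [
            [
                bias[o] + sum(weight[o][i] * x[i][h][w] for i in range(c_in))
                for w in range(width)
            ]
            for h in range(height)
        ]
        for o in range(len(weight))
    ]


def stitched_forward(
    x: Tensor3,
    sender_prefix: Callable[[Tensor3], Tensor3],
    stitch: Callable[[Tensor3], Tensor3],
    receiver_suffix: Callable[[Tensor3], Tensor3],
) -> Tensor3:
    """Run the sender up to its chosen layer, translate, then finish in the receiver."""
    return receiver_suffix(stitch(sender_prefix(x)))


if __name__ == "__main__":
    # Hypothetical example: the sender emits 2 channels, the receiver expects 3.
    activation = [[[1.0, 2.0]], [[3.0, 4.0]]]       # shape (2, 1, 2)
    weight = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # shape (3, 2)
    bias = [0.0, 0.0, 0.5]
    identity = lambda t: t                          # stand-ins for frozen networks
    out = stitched_forward(
        activation, identity, lambda t: conv1x1_stitch(t, weight, bias), identity
    )
    print(len(out), len(out[0]), len(out[0][0]))    # channels, height, width
```

In the usual stitching setup, the sender and receiver networks are kept frozen and only the stitch parameters (`weight` and `bias` here) are trained; the accuracy of the stitched network then indicates how interchangeable the two representations are.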