Many data symmetries can be described in terms of group equivariance, and the most common way of encoding group equivariance in neural networks is to build linear layers that are themselves group equivariant. In this work we investigate whether equivariance of a network implies that all of its layers are equivariant. On the theoretical side, we find cases where equivariance implies layerwise equivariance, but we also demonstrate that this is not the case in general. Nevertheless, we conjecture that CNNs that are trained to be equivariant will exhibit layerwise equivariance, and we explain how this conjecture is a weaker version of the recent permutation conjecture by Entezari et al. [2022]. We perform quantitative experiments with VGG-nets on CIFAR10 and qualitative experiments with ResNets on ImageNet to illustrate and support our theoretical findings. These experiments are not only of interest for understanding how group equivariance is encoded in ReLU-networks; they also give a new perspective on Entezari et al.'s permutation conjecture, as we find that it is typically easier to merge a network with a group-transformed version of itself than to merge two different networks.
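To make the notion of layerwise equivariance concrete, the following is a minimal toy sketch and not the experimental setup of this work. It assumes the two-element group acting on a 2-dimensional input by swapping coordinates (the matrix `P`) and on four hidden units by swapping two blocks of units (the permutation `Q`). All names (`build_toy_network`, `is_layerwise_equivariant`, `max_invariance_error`) are hypothetical and chosen for illustration. The sketch only checks the easy direction: if each linear layer intertwines the group actions and the hidden action is a permutation (so that it commutes with ReLU), then the whole network is invariant.

```python
import random

def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def relu(M):
    """Apply ReLU elementwise to a matrix."""
    return [[max(v, 0.0) for v in row] for row in M]

# Assumed group action: P swaps the two input coordinates,
# Q swaps hidden units (0, 1) with hidden units (2, 3).
P = [[0.0, 1.0], [1.0, 0.0]]
Q = [[0.0, 0.0, 1.0, 0.0],
     [0.0, 0.0, 0.0, 1.0],
     [1.0, 0.0, 0.0, 0.0],
     [0.0, 1.0, 0.0, 0.0]]

def build_toy_network(seed=0):
    """Build weights W1 (4x2) and W2 (1x4) with W1 P = Q W1 and W2 Q = W2."""
    rng = random.Random(seed)
    A = [[rng.gauss(0, 1) for _ in range(2)] for _ in range(2)]
    b = [rng.gauss(0, 1) for _ in range(2)]
    W1 = A + matmul(A, P)  # rows of A stacked on top of rows of A P
    W2 = [b + b]           # the same weights for both blocks of hidden units
    return W1, W2

def forward(W1, W2, x):
    """Two-layer ReLU network applied to a 2x1 column vector x."""
    return matmul(W2, relu(matmul(W1, x)))[0][0]

def close(M, N, tol=1e-12):
    """Elementwise comparison of two matrices of the same shape."""
    return all(abs(u - v) <= tol for r, s in zip(M, N) for u, v in zip(r, s))

def is_layerwise_equivariant(W1, W2):
    """Check the intertwining relation for each linear layer separately."""
    return close(matmul(W1, P), matmul(Q, W1)) and close(matmul(W2, Q), W2)

def max_invariance_error(W1, W2, n_samples=100, seed=1):
    """Largest |f(P x) - f(x)| over random inputs; near zero if f is invariant."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(n_samples):
        x = [[rng.gauss(0, 1)], [rng.gauss(0, 1)]]
        gx = matmul(P, x)
        worst = max(worst, abs(forward(W1, W2, gx) - forward(W1, W2, x)))
    return worst
```

Calling `is_layerwise_equivariant(*build_toy_network())` checks the per-layer condition, and `max_invariance_error(*build_toy_network())` checks invariance of the whole network up to floating-point rounding. The question studied here is the converse: whether a network that is equivariant as a whole must have layers that satisfy such relations for some choice of hidden permutation.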