Equivariant neural networks have shown improved performance, expressiveness, and sample complexity on domains with symmetries. However, for some symmetries, representations, and choices of coordinates, the most common point-wise activations, such as ReLU, are not equivariant and therefore cannot be used in the design of equivariant neural networks. The theorem we present in this paper describes all combinations of finite-dimensional representations, choices of coordinates, and point-wise activations that yield an exactly equivariant layer, generalizing and strengthening existing characterizations. Notable cases of practical relevance are discussed as corollaries. In particular, we prove that rotation-equivariant networks can only be invariant, as is the case for any network that is equivariant with respect to a connected compact group. We then discuss the implications of our findings for important instances of exactly equivariant networks. First, we completely characterize permutation-equivariant networks with point-wise nonlinearities, such as Invariant Graph Networks, and their geometric counterparts, highlighting a plethora of models whose expressive power and performance are still unknown. Second, we show that the feature spaces of disentangled steerable convolutional neural networks are trivial representations.
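As a small numerical illustration of the claim that point-wise activations such as ReLU are equivariant for some representations and coordinate choices but not others, the sketch below checks whether ReLU commutes with a permutation matrix and with a rotation matrix acting on R^2 in the standard basis. It is only a sanity check on one input vector, not the paper's theorem or proof; the helper names (`relu`, `matvec`, `commutes_with_relu`) and the test vector are hypothetical choices made for this example.

```python
import math

def relu(v):
    """Apply ReLU to each coordinate of a vector (a point-wise activation)."""
    return [max(0.0, x) for x in v]

def matvec(m, v):
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [sum(a * x for a, x in zip(row, v)) for row in m]

def commutes_with_relu(m, v, tol=1e-9):
    """Check relu(m v) == m relu(v) for one matrix m and one vector v."""
    lhs = relu(matvec(m, v))
    rhs = matvec(m, relu(v))
    return all(abs(a - b) <= tol for a, b in zip(lhs, rhs))

# Permutation representation: swap the two coordinates.
swap = [[0.0, 1.0],
        [1.0, 0.0]]

# Rotation by 90 degrees, written in the standard basis of R^2.
theta = math.pi / 2
rot90 = [[math.cos(theta), -math.sin(theta)],
         [math.sin(theta),  math.cos(theta)]]

# Arbitrary test vector with one negative coordinate (an assumption of this sketch).
v = [1.0, -2.0]

perm_ok = commutes_with_relu(swap, v)   # ReLU commutes with the swap here
rot_ok = commutes_with_relu(rot90, v)   # ReLU does not commute with the rotation here

print("permutation:", perm_ok)
print("rotation:", rot_ok)
```

For this input, ReLU commutes with the coordinate swap but not with the rotation, which is consistent with the abstract's statement that equivariance of a point-wise activation depends on the representation and the choice of coordinates. A single counterexample shows failure of equivariance; a single passing vector does not prove it.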