Latent representations are used extensively for downstream tasks such as visualization, interpolation, or feature extraction of deep learning models. Invariant and equivariant neural networks are powerful and well-established models for enforcing inductive biases. In this paper, we demonstrate that the inductive bias imposed by an equivariant model must also be taken into account when using its latent representations. We show how not accounting for the inductive biases leads to decreased performance on downstream tasks and, conversely, how they can be accounted for effectively by using an invariant projection of the latent representations. We propose principles for choosing such a projection and show the impact of using these principles in two common examples. First, we study a permutation-equivariant variational auto-encoder trained for molecule graph generation; here we show that invariant projections can be designed that incur no loss of information in the resulting invariant representation. Next, we study a rotation-equivariant representation used for image classification; here we illustrate how random invariant projections can be used to obtain an invariant representation that retains a high degree of information. In both cases, the analysis of invariant latent representations proves superior to that of their equivariant counterparts. Finally, we illustrate that the phenomena documented here for equivariant neural networks have counterparts in standard neural networks where invariance is encouraged via augmentation. Thus, while these ambiguities may be known to experienced developers of equivariant models, we make both the knowledge and effective tools for handling the ambiguities available to the broader community.
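As a toy illustration of the permutation case, the sketch below shows why an equivariant latent is ambiguous and how an invariant projection removes the ambiguity. It is not the construction used in the paper: the helper name `invariant_projection`, the latent shape, and the choice of lexicographic row sorting as the projection are all assumptions made for illustration. Sorting the rows discards only the node ordering, so no information beyond the permutation itself is lost.

```python
import numpy as np


def invariant_projection(z):
    """Hypothetical helper: put the rows of a (n_nodes, d) latent in canonical order."""
    # np.lexsort treats the LAST key as the primary one, so the columns are
    # reversed to make column 0 the primary sort key.
    order = np.lexsort(z.T[::-1])
    return z[order]


rng = np.random.default_rng(0)

# Assumed setup: a permutation-equivariant encoder maps a graph with 5 nodes
# to one 3-dimensional latent vector per node.
z = rng.normal(size=(5, 3))

# Relabelling the nodes of the same graph permutes the rows of the latent.
z_permuted = z[[1, 2, 3, 4, 0]]

# The raw equivariant latents differ although they describe the same graph.
same_raw = np.allclose(z, z_permuted)

# After the invariant projection, both latents coincide.
same_projected = np.allclose(
    invariant_projection(z), invariant_projection(z_permuted)
)
```

Comparing `same_raw` with `same_projected` shows the effect: a distance computed on the raw latents would treat two labelings of one graph as different points, while the projected latents agree.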