We present a novel framework to overcome the limitations of equivariant architectures in learning functions with group symmetries. In contrast to equivariant architectures, we use an arbitrary base model, such as an MLP or a transformer, and symmetrize it to be equivariant to the given group by employing a small equivariant network that parameterizes the probability distribution underlying the symmetrization. The distribution is trained end-to-end with the base model, which can maximize performance while reducing the sample complexity of symmetrization. We show that this approach ensures not only equivariance to the given group but also universal approximation capability in expectation. We implement our method on various base models, including patch-based transformers that can be initialized from pretrained vision transformers, and test them on a wide range of symmetry groups, including permutation and Euclidean groups and their combinations. Empirical tests show competitive results against tailored equivariant architectures, suggesting the potential for learning equivariant functions for diverse groups using a non-equivariant universal base architecture. We further show evidence of enhanced learning in symmetric modalities, such as graphs, when pretrained from non-symmetric modalities, such as vision. Code is available at https://github.com/jw9730/lps.
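To make the idea of symmetrizing a non-equivariant base model concrete, the following is a minimal sketch for the permutation group acting on a list of numbers. It is not the implementation from the linked repository. It assumes the common symmetrization form in which the output is an average of g · f(g⁻¹ · x) over sampled group elements g, and it replaces the small equivariant network with a hypothetical elementwise score `w * x_i + eps_i` followed by an argsort. All names (`base_model`, `sample_order`, `symmetrized_term`, `symmetrize`, `permute`, `w`) are illustrative assumptions.

```python
import random

def base_model(z):
    # Arbitrary, deliberately non-equivariant base function:
    # the output at position k depends on k and on the first element.
    return [(k + 1) * v + z[0] for k, v in enumerate(z)]

def sample_order(x, eps, w=1.0):
    # Toy stand-in (assumption) for the small equivariant network:
    # elementwise scores from input and noise, turned into a permutation by argsort.
    scores = [w * xi + ei for xi, ei in zip(x, eps)]
    return sorted(range(len(x)), key=lambda i: scores[i])

def symmetrized_term(x, eps, w=1.0):
    # One sample of g . f(g^{-1} . x) for the sampled permutation g.
    order = sample_order(x, eps, w)
    z = [x[i] for i in order]       # g^{-1} . x: reorder the input
    y = base_model(z)               # apply the base model
    out = [0.0] * len(x)
    for k, i in enumerate(order):
        out[i] = y[k]               # g . y: scatter outputs back
    return out

def symmetrize(x, noises, w=1.0):
    # Monte Carlo average over the given noise samples.
    terms = [symmetrized_term(x, eps, w) for eps in noises]
    return [sum(t[i] for t in terms) / len(terms) for i in range(len(x))]

def permute(v, p):
    # Apply the permutation p to the list v.
    return [v[j] for j in p]

rng = random.Random(0)
x = [0.3, -1.2, 2.5, 0.7]
noises = [[rng.gauss(0.0, 1.0) for _ in x] for _ in range(8)]
p = [2, 0, 3, 1]

# Permuting the input (and the noise with it) permutes the output in the same way.
lhs = symmetrize(permute(x, p), [permute(e, p) for e in noises])
rhs = permute(symmetrize(x, noises), p)
print(all(abs(a - b) < 1e-12 for a, b in zip(lhs, rhs)))
```

In this sketch the equivariance check permutes the noise together with the input, which makes the identity hold sample by sample. With independently drawn noise, as would be the case in practice, equivariance holds in expectation rather than for each finite sample estimate. The weight `w` stands in for the trainable parameters of the sampling distribution.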