Equivariant networks are specifically designed to ensure consistent behavior with respect to a set of input transformations, leading to higher sample efficiency and more accurate and robust predictions. However, redesigning each component of prevalent deep neural network architectures to achieve a chosen equivariance is difficult and can yield a network that is computationally expensive during both training and inference. A recently proposed alternative route to equivariance removes these architectural constraints: a simple canonicalization network transforms the input to a canonical form before it is fed to an unconstrained prediction network. We show here that this approach can be used effectively to make a large pretrained network equivariant. However, we observe that the produced canonical orientations can be misaligned with those of the training distribution, hindering performance. Using dataset-dependent priors to inform the canonicalization function, we are able to make large pretrained models equivariant while maintaining their performance. This significantly improves the robustness of these models to deterministic transformations of the data, such as rotations. We believe this equivariant adaptation of large pretrained models can benefit domain-specific applications that have known symmetry priors.
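The canonicalization idea described in the abstract can be illustrated with a minimal sketch. It assumes the group of 90-degree rotations acting on a small 2D grid. The `score` function is a hypothetical stand-in for a learned canonicalization network, and `predictor` stands in for an unconstrained pretrained model; neither name comes from the paper, and the dataset-dependent prior is not modeled here.

```python
def rot90(image):
    """Rotate a 2D grid (list of rows) by 90 degrees counterclockwise."""
    return [list(row) for row in zip(*image)][::-1]


def orbit(image):
    """Return the four rotations of the image (its orbit under the group)."""
    rotations = [image]
    for _ in range(3):
        rotations.append(rot90(rotations[-1]))
    return rotations


def score(image):
    """Hypothetical stand-in for a canonicalization network.

    Any function that is NOT rotation invariant works; here each cell is
    weighted by its row-major index.
    """
    n_cols = len(image[0])
    return sum((r * n_cols + c) * value
               for r, row in enumerate(image)
               for c, value in enumerate(row))


def canonicalize(image):
    """Pick the highest-scoring rotation as the canonical form.

    All rotations of an input share the same orbit, so they map to the same
    canonical form (assuming the maximum score is unique).
    """
    return max(orbit(image), key=score)


def predictor(image):
    """Stand-in for an unconstrained model; depends on orientation."""
    return image[0][0] - image[-1][-1]


def invariant_predict(image):
    """Canonicalize first, then apply the unconstrained predictor."""
    return predictor(canonicalize(image))


example = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
```

Calling `invariant_predict` on `example` and on `rot90(example)` gives the same value, whereas `predictor` alone does not. The sketch also hints at the misalignment issue the abstract raises: the canonical orientation is whatever maximizes `score`, which need not match the orientation the predictor saw during training.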