Equivariant deep learning architectures exploit symmetries in learning problems to improve the sample efficiency of neural-network-based models and their ability to generalise. However, when modelling real-world data, learning problems are often only approximately equivariant rather than exactly so. For example, when estimating the global temperature field from weather station observations, local topographical features like mountains break translation equivariance. In these scenarios, it is desirable to construct architectures that can flexibly depart from exact equivariance in a data-driven way. In this paper, we develop a general approach to achieving this using existing equivariant architectures. Our approach is agnostic to both the choice of symmetry group and model architecture, making it widely applicable. We consider the use of approximately equivariant architectures in neural processes (NPs), a popular family of meta-learning models. We evaluate our approach on a number of synthetic and real-world regression experiments, demonstrating that approximately equivariant NP models can outperform both their non-equivariant and strictly equivariant counterparts.