Equivariant deep learning architectures exploit symmetries in learning problems to improve the sample efficiency of neural-network-based models and their ability to generalise. However, when modelling real-world data, learning problems are often not exactly equivariant, but only approximately so. For example, when estimating the global temperature field from weather station observations, local topographical features like mountains break translation equivariance. In these scenarios, it is desirable to construct architectures that can flexibly depart from exact equivariance in a data-driven way. Current approaches to achieving this cannot usually be applied out-of-the-box to any architecture and symmetry group. In this paper, we develop a general approach for building approximately equivariant models from existing equivariant architectures. Our approach is agnostic to both the choice of symmetry group and model architecture, making it widely applicable. We consider the use of approximately equivariant architectures in neural processes (NPs), a popular family of meta-learning models. We demonstrate the effectiveness of our approach on a number of synthetic and real-world regression experiments, showing that approximately equivariant NP models can outperform both their non-equivariant and strictly equivariant counterparts.