Symmetry is present throughout nature and plays an increasingly central role in physics and machine learning. Fundamental symmetries, such as Poincaré invariance, allow physical laws discovered in laboratories on Earth to be extrapolated to the farthest reaches of the universe. Symmetry is essential to achieving the same extrapolatory power in machine learning applications. For example, translation invariance in image classification allows models with fewer parameters, such as convolutional neural networks, to be trained on smaller data sets and still achieve state-of-the-art performance. In this paper, we provide a unifying theoretical and methodological framework for incorporating symmetry into machine learning models in three ways: (1) enforcing known symmetry when training a model; (2) discovering unknown symmetries of a given model or data set; and (3) promoting symmetry during training by learning a model that breaks symmetries within a user-specified group of candidates only when there is sufficient evidence in the data. We show that these tasks can be cast within a common mathematical framework whose central object is the Lie derivative associated with fiber-linear Lie group actions on vector bundles. We extend and unify several existing results by showing that enforcing and discovering symmetry are linear-algebraic tasks that are dual with respect to the bilinear structure of the Lie derivative. We also propose a novel way to promote symmetry: a class of convex regularization functions, based on the Lie derivative and nuclear norm relaxation, that penalize symmetry breaking during the training of machine learning models. We explain how these ideas can be applied to a wide range of machine learning models, including basis function regression, dynamical systems discovery, multilayer perceptrons, and neural networks acting on spatial fields such as images.
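To make the claim that discovering symmetry is a linear-algebraic task more concrete, the following is a minimal sketch in a deliberately simple setting, not the general vector-bundle formulation of the paper. It assumes a scalar function f on R^n and candidate symmetries generated by linear vector fields x -> A x, so that the Lie derivative (L_A f)(x) = grad f(x) . (A x) is linear in the generator A. Generators of symmetries then form the nullspace of a matrix assembled from sample points. The function names `lie_derivative_matrix` and `discover_generators`, the test function f(x) = ||x||^2, and the tolerance are illustrative assumptions.

```python
import numpy as np


def lie_derivative_matrix(grad_f, points):
    """Stack one row per sample point x; each row maps vec(A) to (L_A f)(x).

    (L_A f)(x) = grad f(x) . (A x) = sum_ij grad_f(x)_i * A_ij * x_j,
    which is linear in the entries of A (row-major flattening).
    """
    return np.stack([np.outer(grad_f(x), x).ravel() for x in points])


def discover_generators(grad_f, points, tol=1e-10):
    """Return an orthonormal basis of matrices A with L_A f = 0 at all samples."""
    M = lie_derivative_matrix(grad_f, points)
    _, s, vt = np.linalg.svd(M)            # full SVD: vt is (n*n, n*n)
    rank = int(np.sum(s > tol * s[0]))     # numerical rank of M
    n = points.shape[1]
    return vt[rank:].reshape(-1, n, n)     # remaining rows span the nullspace


# Hypothetical example: f(x) = ||x||^2 on R^3, whose gradient is 2x.
rng = np.random.default_rng(0)
points = rng.standard_normal((50, 3))
generators = discover_generators(lambda x: 2.0 * x, points)

print("number of symmetry generators found:", generators.shape[0])
print("all antisymmetric:",
      all(np.allclose(A, -A.T, atol=1e-8) for A in generators))
```

In this toy case the recovered generators are antisymmetric matrices, that is, infinitesimal rotations, as expected for a function that depends only on the norm of x. Enforcing a known symmetry is the dual use of the same bilinear expression: fix the generators A and solve for the model parameters that make the Lie derivative vanish.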