The explicit incorporation of task-specific inductive biases through symmetry has emerged as a general design precept in the development of high-performance machine learning models. For example, group equivariant neural networks have demonstrated impressive performance across various domains and applications such as protein and drug design. A prevalent intuition about such models is that integrating the relevant symmetry improves generalization. Moreover, it is posited that when the data and/or the model exhibit only $\textit{approximate}$ or $\textit{partial}$ symmetry, the best-performing model is one whose symmetry aligns with that of the data. In this paper, we conduct a formal, unified investigation of these intuitions. To begin, we present general quantitative bounds that demonstrate how models capturing task-specific symmetries lead to improved generalization. Our results do not require the set of transformations to be finite or even to form a group, and they also apply to partial or approximate equivariance. Using this quantification, we examine the more general question of model mis-specification, i.e., the case where the model symmetries do not align with the data symmetries. We establish, for a given symmetry group, a quantitative comparison between the approximate/partial equivariance of the model and that of the data distribution, precisely connecting the model equivariance error to the data equivariance error. Our result delineates the conditions under which the model equivariance error is optimal, thereby yielding the best-performing model for the given task and data.