This paper studies separating invariants: mappings on $D$-dimensional domains which are invariant to an appropriate group action, and which separate orbits. The motivation for this study comes from the usefulness of separating invariants in proving universality of equivariant neural network architectures. We observe that in several cases the cardinality of separating invariants proposed in the machine learning literature is much larger than the dimension $D$. As a result, the theoretical universal constructions based on these separating invariants are unrealistically large. Our goal in this paper is to resolve this issue. We show that when a continuous family of semi-algebraic separating invariants is available, separation can be obtained by randomly selecting $2D+1$ of these invariants. We apply this methodology to obtain an efficient scheme for computing separating invariants for several classical group actions which have been studied in the invariant learning literature. Examples include matrix multiplication actions on point clouds by permutations, rotations, and various other linear groups. Often the requirement of invariant separation is relaxed and only generic separation is required. In this case, we show that only $D+1$ invariants are required. More importantly, generic invariants are often significantly easier to compute, as we illustrate by discussing generic and full separation for weighted graphs. Finally, we outline an approach for proving that separating invariants can also be constructed when the random parameters have finite precision.
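To make the random-selection idea concrete, the following is a minimal sketch for one of the listed examples, the permutation action on a point cloud $X \in \mathbb{R}^{d \times n}$ (so $D = dn$). It assumes a particular continuous family of invariants, $X \mapsto \langle b, \mathrm{sort}(a^T X) \rangle$ with parameters $a \in \mathbb{R}^d$ and $b \in \mathbb{R}^n$; this family, the Gaussian sampling of the parameters, and the name `random_sort_invariants` are illustrative assumptions and are not specified in the abstract.

```python
import numpy as np

def random_sort_invariants(d, n, seed=0):
    """Return a map R^{d x n} -> R^{2D+1}, D = d*n, built from randomly
    drawn members of an assumed family of permutation invariants."""
    D = d * n
    m = 2 * D + 1                      # number of invariants from the abstract
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((m, d))    # one projection direction a_i per invariant
    B = rng.standard_normal((m, n))    # one weight vector b_i per invariant

    def embed(X):
        projections = A @ X                          # shape (m, n): row i is a_i^T X
        sorted_proj = np.sort(projections, axis=1)   # sorting removes the column order
        return np.sum(B * sorted_proj, axis=1)       # i-th entry is <b_i, sort(a_i^T X)>

    return embed

# Usage: compare a point cloud with a column-permuted copy and with an unrelated cloud.
embed = random_sort_invariants(d=3, n=5)
rng = np.random.default_rng(1)
X = rng.standard_normal((3, 5))
X_perm = X[:, rng.permutation(5)]      # same orbit under the permutation action
Y = rng.standard_normal((3, 5))        # a different point cloud

print(np.allclose(embed(X), embed(X_perm)))
print(np.linalg.norm(embed(X) - embed(Y)))
```

The sketch only checks invariance numerically on one example; it does not verify separation, which is the substance of the theoretical result.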