Permutation invariance is among the most common symmetries that can be exploited to simplify complex problems in machine learning (ML). There has been a tremendous surge of research on building permutation invariant ML architectures. However, less attention has been given to: (1) how to statistically test for permutation invariance of the coordinates of a random vector whose dimension is allowed to grow with the sample size; and (2) how to leverage permutation invariance in estimation problems and how it helps reduce dimensionality. In this paper, we take a step back and examine these questions in several fundamental problems: (i) testing the assumption of permutation invariance of multivariate distributions; (ii) estimating permutation invariant densities; (iii) analyzing the metric entropy of permutation invariant function classes and comparing it with that of their counterparts without permutation invariance imposed; (iv) deriving an embedding of permutation invariant reproducing kernel Hilbert spaces for efficient computation. In particular, our methods for (i) and (iv) are based on a sorting trick, and our method for (ii) is based on an averaging trick. These tricks substantially simplify the exploitation of permutation invariance.
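The abstract names a sorting trick and an averaging trick without spelling them out. As a rough illustration only, and not the paper's actual constructions, the two generic ideas can be sketched in Python with NumPy: composing any function with a coordinate sort yields a permutation invariant function, and averaging a function over all coordinate permutations symmetrizes it. The names `sort_embed`, `symmetrize`, and the test function `g` are hypothetical and chosen here for illustration.

```python
import itertools

import numpy as np


def sort_embed(x):
    """Return the coordinates of x in ascending order.

    Any function applied to this output is permutation invariant in x.
    (Hypothetical helper; a generic version of a 'sorting trick'.)
    """
    return np.sort(np.asarray(x, dtype=float), axis=-1)


def symmetrize(f, x):
    """Average f over all d! coordinate permutations of the 1-D vector x.

    (Hypothetical helper; a generic version of an 'averaging trick'.
    Brute force, so only feasible for small d.)
    """
    x = np.asarray(x, dtype=float)
    d = x.shape[-1]
    values = [f(x[list(p)]) for p in itertools.permutations(range(d))]
    return float(np.mean(values))


def g(x):
    # An arbitrary function that is NOT permutation invariant.
    return x[0] + 2.0 * x[1] ** 2 + 3.0 * x[2] ** 3


x = np.array([0.3, -1.2, 2.0])
x_perm = x[[2, 0, 1]]  # the same coordinates in a different order

# Sorting removes the dependence on coordinate order.
same_sorted = np.array_equal(sort_embed(x), sort_embed(x_perm))

# g itself depends on the order, but its symmetrized version does not.
g_differs = not np.isclose(g(x), g(x_perm))
sym_agrees = np.isclose(symmetrize(g, x), symmetrize(g, x_perm))
```

The brute-force average costs d! evaluations, which is why this sketch is limited to very small d; sorting costs only O(d log d) per vector.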