Feature alignment methods are used in many scientific disciplines for data pooling, annotation, and comparison. As an instance of a permutation learning problem, feature alignment presents significant statistical and computational challenges. In this work, we propose the covariance alignment model to study and compare various alignment methods, and we establish a minimax lower bound for covariance alignment that has a non-standard dimension scaling because of the presence of a nuisance parameter. This lower bound is in fact tight and is achieved by a natural quasi-MLE. However, this estimator involves a search over all permutations, which is computationally infeasible even for problems of moderate size. To overcome this limitation, we show that the celebrated Gromov-Wasserstein algorithm from optimal transport, which is more amenable to fast implementation even on large-scale problems, is also minimax optimal. These results give the first statistical justification for the deployment of the Gromov-Wasserstein algorithm in practice.
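To make the computational obstacle concrete, the following minimal sketch aligns two covariance matrices by exhaustive search over permutations. It is an illustrative stand-in, not the quasi-MLE analyzed in this work: the Frobenius-norm objective, the noiseless population covariances, and the names `align_by_exhaustive_search`, `sigma`, and `true_perm` are all assumptions introduced for the example.

```python
import itertools
import math

import numpy as np


def align_by_exhaustive_search(cov_a, cov_b):
    """Find the permutation that best matches cov_b to cov_a.

    Assumed objective (illustration only): minimize the Frobenius norm
    of cov_a - cov_b[perm][:, perm] over all d! permutations.
    """
    d = cov_a.shape[0]
    best_perm, best_cost = None, np.inf
    for perm in itertools.permutations(range(d)):
        idx = np.array(perm)
        # Permute rows and columns of cov_b simultaneously.
        permuted = cov_b[np.ix_(idx, idx)]
        cost = np.linalg.norm(cov_a - permuted)
        if cost < best_cost:
            best_perm, best_cost = perm, cost
    return best_perm, best_cost


# Toy, noiseless setup: the second matrix is the first with features shuffled.
rng = np.random.default_rng(0)
d = 5
m = rng.normal(size=(d, d))
sigma = m @ m.T + np.eye(d)          # a generic positive definite matrix
true_perm = rng.permutation(d)       # hidden shuffling of the features
sigma_shuffled = sigma[np.ix_(true_perm, true_perm)]

recovered, cost = align_by_exhaustive_search(sigma, sigma_shuffled)

# The search space grows factorially with the number of features.
search_sizes = {k: math.factorial(k) for k in (5, 10, 20)}
```

In this noiseless toy the search recovers the inverse of the hidden shuffling, but the loop visits d! candidates: 120 for d = 5 and already 3,628,800 for d = 10. This growth is what motivates replacing exhaustive search with a faster procedure such as Gromov-Wasserstein.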