Manifold data analysis is challenging due to the lack of parametric distributions on manifolds. To address this, we introduce a series of Riemannian radial distributions on Riemannian symmetric spaces. By exploiting the symmetry, we show that for many Riemannian radial distributions, the Riemannian $L^p$ center of mass is uniquely given by the location parameter, and the maximum likelihood estimator (MLE) of this parameter is an M-estimator. These parametric distributions therefore provide a promising tool for statistical modeling and algorithmic design. In addition, we develop a novel theory for parameter estimation and minimax optimality by integrating statistics, Riemannian geometry, and Lie theory. We demonstrate that the MLE achieves a root-$n$ convergence rate up to logarithmic terms, where the rate is quantified by both the Hellinger distance between distributions and the geodesic distance between parameters. We then derive a root-$n$ minimax lower bound for the parameter estimation rate, demonstrating the optimality of the MLE up to logarithmic terms. For technical reasons, our minimax analysis is limited to simply connected Riemannian symmetric spaces, but it still covers numerous applications. Finally, we extend our study to Riemannian radial distributions with an unknown temperature parameter and establish the convergence rate of the MLE. We also derive the model complexity of von Mises-Fisher distributions on spheres and discuss the effects of geometry on statistical estimation.