The Fisher-Rao distance between two probability distributions of a statistical model is defined as the Riemannian geodesic distance induced by the Fisher information metric. To calculate the Fisher-Rao distance in closed form, we need (1) to derive a formula for the Fisher-Rao geodesics, and (2) to integrate the Fisher length element along those geodesics. We consider several numerically robust approximation and bounding techniques for Fisher-Rao distances. First, we report generic upper bounds on Fisher-Rao distances based on closed-form 1D Fisher-Rao distances of submodels. Second, we describe several generic approximation schemes, depending on whether the Fisher-Rao geodesics or pregeodesics are available in closed form. In particular, we obtain a generic method that guarantees an arbitrarily small additive error on the approximation, provided that Fisher-Rao pregeodesics and tight lower and upper bounds are available. Third, we consider the case where the Fisher metric is a Hessian metric, and report generic tight upper bounds on the Fisher-Rao distances using techniques of information geometry. Uniparametric and biparametric statistical models always have Fisher Hessian metrics, and in general a simple test lets one check whether the Fisher information matrix yields a Hessian metric. Fourth, we consider elliptical distribution families and show how to apply the above techniques to these models. We also propose two new distances, based either on the Fisher-Rao lengths of curves serving as proxies for Fisher-Rao geodesics, or on the Birkhoff/Hilbert projective cone distance. Last, we consider an alternative group-theoretic approach for statistical transformation models based on the notion of maximal invariant. This approach yields insights into the structure of the Fisher-Rao distance formula that may be used fruitfully in applications.
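As a rough illustration of the first two ideas (upper bounds from 1D submodels, and lengths of proxy curves), the following Python sketch uses the univariate normal family, where the Fisher-Rao distance has a well-known closed form via hyperbolic geometry. That choice of family, the function names, and the straight-line proxy curve are assumptions made for this example only; they are not taken from the abstract and are not the paper's general method.

```python
import math


def fisher_rao_normal(mu1, sigma1, mu2, sigma2):
    """Closed-form Fisher-Rao distance between N(mu1, sigma1^2) and N(mu2, sigma2^2).

    Uses the Fisher metric ds^2 = (dmu^2 + 2 dsigma^2) / sigma^2, which is
    a rescaled hyperbolic upper half-plane metric.
    """
    num = 0.5 * (mu1 - mu2) ** 2 + (sigma1 - sigma2) ** 2
    return math.sqrt(2.0) * math.acosh(1.0 + num / (2.0 * sigma1 * sigma2))


def submodel_upper_bound(mu1, sigma1, mu2, sigma2):
    """Upper bound obtained by chaining two 1D submodels.

    Scale leg (fixed mean):   sqrt(2) * |log(sigma2 / sigma1)|.
    Location leg (fixed std): |mu2 - mu1| / sigma, taken at the larger sigma.
    The two legs form an admissible path, so their total length
    upper-bounds the geodesic distance.
    """
    scale_leg = math.sqrt(2.0) * abs(math.log(sigma2 / sigma1))
    location_leg = abs(mu2 - mu1) / max(sigma1, sigma2)
    return scale_leg + location_leg


def proxy_curve_length(mu1, sigma1, mu2, sigma2, n=1000):
    """Approximate Fisher-Rao length of the straight segment in (mu, sigma).

    Integrates the Fisher length element with the midpoint rule over n
    sub-intervals. The segment is a proxy curve, not a geodesic, so its
    exact length is at least the Fisher-Rao distance.
    """
    dmu = (mu2 - mu1) / n
    dsigma = (sigma2 - sigma1) / n
    step = math.sqrt(dmu ** 2 + 2.0 * dsigma ** 2)
    total = 0.0
    for i in range(n):
        sigma_mid = sigma1 + (i + 0.5) * dsigma  # stays positive on the segment
        total += step / sigma_mid
    return total


if __name__ == "__main__":
    p, q = (0.0, 1.0), (1.0, 2.0)
    print("exact distance      :", fisher_rao_normal(*p, *q))
    print("proxy curve length  :", proxy_curve_length(*p, *q))
    print("submodel upper bound:", submodel_upper_bound(*p, *q))
```

In this sketch both approximations sit above the exact distance, and the gap shows how far a proxy curve or a chain of 1D submodels is from the true geodesic. Note that the midpoint rule gives only a numerical estimate of the curve length, not a certified bound.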