Approximation and bounding techniques for the Fisher-Rao distances

The Fisher-Rao distance between two probability distributions of a statistical model is defined as the Riemannian geodesic distance induced by the Fisher information metric. To calculate the Fisher-Rao distance in closed form, we need (1) to derive a formula for the Fisher-Rao geodesics, and (2) to integrate the Fisher length element along those geodesics. We consider several numerically robust approximation and bounding techniques for Fisher-Rao distances. First, we report generic upper bounds on Fisher-Rao distances based on closed-form 1D Fisher-Rao distances of submodels. Second, we describe several generic approximation schemes depending on whether the Fisher-Rao geodesics or pregeodesics are available in closed form. In particular, we obtain a generic method that guarantees an arbitrarily small additive error on the approximation, provided that Fisher-Rao pregeodesics and tight lower and upper bounds are available. Third, we consider the case where the Fisher metric is a Hessian metric, and report generic tight upper bounds on the Fisher-Rao distances using techniques of information geometry. Uniparametric and biparametric statistical models always have Fisher Hessian metrics, and in general a simple test allows one to check whether the Fisher information matrix yields a Hessian metric. Fourth, we consider elliptical distribution families and show how to apply the above techniques to these models. We also propose two new distances, based either on the Fisher-Rao lengths of curves serving as proxies of Fisher-Rao geodesics or on the Birkhoff/Hilbert projective cone distance. Last, we consider an alternative group-theoretic approach for statistical transformation models based on the notion of maximal invariant, which yields insights into the structure of the Fisher-Rao distance formula that may be used fruitfully in applications.
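As a small illustration of the bounding idea, the sketch below uses the univariate normal family N(mu, sigma^2), whose Fisher metric ds^2 = (dmu^2 + 2 dsigma^2) / sigma^2 admits a well-known closed-form Fisher-Rao distance. The choice of this family, the function names, and the midpoint quadrature are assumptions made for illustration and are not taken from the paper. The code compares the exact distance with two upper bounds: the sum of two 1D submodel distances (fixed sigma, then fixed mu), and the Fisher-Rao length of the straight segment in (mu, sigma) used as a proxy curve.

```python
import math


def fisher_rao_normal(mu1, s1, mu2, s2):
    """Closed-form Fisher-Rao distance between N(mu1, s1^2) and N(mu2, s2^2).

    Follows from ds^2 = (dmu^2 + 2 dsigma^2) / sigma^2, which is twice the
    hyperbolic upper half-plane metric in the coordinates (mu / sqrt(2), sigma).
    """
    num = 0.5 * (mu1 - mu2) ** 2 + (s1 - s2) ** 2
    return math.sqrt(2.0) * math.acosh(1.0 + num / (2.0 * s1 * s2))


def submodel_upper_bound(mu1, s1, mu2, s2):
    """Upper bound obtained by chaining two 1D submodels.

    The mean is moved along the line of constant sigma = max(s1, s2)
    (length |mu1 - mu2| / sigma) and the scale is moved at constant mu
    (length sqrt(2) * |log(s2 / s1)|). The concatenated path is a curve
    joining the two points, so its length bounds the distance from above.
    """
    along_mu = abs(mu1 - mu2) / max(s1, s2)
    along_sigma = math.sqrt(2.0) * abs(math.log(s2 / s1))
    return along_mu + along_sigma


def segment_length(mu1, s1, mu2, s2, n=10000):
    """Fisher-Rao length of the straight segment in (mu, sigma).

    Integrates the length element with the midpoint rule; n is an
    illustrative default, not a tuned value.
    """
    dmu, ds = mu2 - mu1, s2 - s1
    speed_numerator = math.sqrt(dmu ** 2 + 2.0 * ds ** 2)
    total = 0.0
    for i in range(n):
        t = (i + 0.5) / n
        sigma = s1 + t * ds  # stays positive since s1, s2 > 0
        total += speed_numerator / sigma
    return total / n


if __name__ == "__main__":
    p, q = (0.0, 1.0), (1.0, 2.0)
    print("exact distance      :", fisher_rao_normal(*p, *q))
    print("segment length      :", segment_length(*p, *q))
    print("submodel upper bound:", submodel_upper_bound(*p, *q))
```

Any curve joining the two parameter points yields an upper bound, so a proxy curve closer to the true geodesic gives a tighter one. When the two means coincide, the vertical segment is itself a geodesic and the segment length matches the exact distance up to quadrature error.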
