Estimating Fisher Information Matrix in Latent Variable Models based on the Score Function

The Fisher information matrix (FIM) is a key quantity in statistics: it is required, for example, to evaluate the asymptotic precision of parameter estimates, to compute test statistics or asymptotic distributions in statistical testing, and to evaluate post-model-selection inference results or optimality criteria in experimental design. However, its exact computation is often not trivial. In many latent variable models in particular, it is intricate because of the unobserved variables. The observed FIM is therefore usually used in this context to estimate the FIM. Several methods have been proposed to approximate the observed FIM when it cannot be evaluated analytically. Among the most frequently used are Monte Carlo methods and iterative algorithms derived from the missing information principle. All these methods require computing second derivatives of the complete data log-likelihood, which is a disadvantage from a computational point of view. In this paper, we present a new approach to estimating the FIM in latent variable models. The advantage of our method is that only the first derivatives of the log-likelihood are needed, in contrast to approaches based on the observed FIM. Specifically, we consider the empirical estimate of the covariance matrix of the score. We prove that this estimate of the Fisher information matrix is unbiased, consistent and asymptotically Gaussian. Moreover, we highlight that neither of the two estimates (the proposed one and the observed FIM) is better than the other in terms of asymptotic covariance matrix. When the proposed estimate cannot be evaluated analytically, we present a stochastic approximation estimation algorithm to compute it. This algorithm provides the estimate of the FIM as a by-product of the parameter estimates. We emphasize that the proposed algorithm only requires computing the first derivatives of the complete data log-likelihood with respect to the parameters.
We prove that the estimation algorithm is consistent and asymptotically Gaussian as the number of iterations goes to infinity. We evaluate the finite-sample properties of the proposed estimate and of the observed FIM through simulation studies in linear mixed effects models and mixture models. We also investigate the convergence properties of the estimation algorithm in nonlinear mixed effects models. We compare the performance of the proposed algorithm with that of other existing methods.
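
To make the idea of a score-based estimate concrete, the following is a minimal sketch, not the algorithm of the paper. It assumes a two-component Gaussian mixture with known weights and unit variances, where only the component means are unknown, and it evaluates the estimate at the true means. In this toy case the score of each observation is available in closed form through Fisher's identity (the observed score is the conditional expectation of the complete data score), so no stochastic approximation is needed. The function names `mixture_scores` and `score_based_fim` are hypothetical, and the uncentered form (the average of outer products of individual scores) is an assumption about how the empirical covariance is taken.

```python
import numpy as np


def mixture_scores(y, mu, weights, sigma=1.0):
    """Per-observation scores w.r.t. the component means (hypothetical helper).

    Assumes a Gaussian mixture with known weights and a common known sigma.
    """
    y = np.asarray(y, dtype=float)[:, None]        # shape (n, 1)
    mu = np.asarray(mu, dtype=float)[None, :]      # shape (1, K)
    w = np.asarray(weights, dtype=float)[None, :]  # shape (1, K)
    # Log of weight * normal density, up to a constant shared by all components.
    log_comp = np.log(w) - 0.5 * ((y - mu) / sigma) ** 2
    log_comp -= log_comp.max(axis=1, keepdims=True)  # numerical stability
    resp = np.exp(log_comp)
    resp /= resp.sum(axis=1, keepdims=True)          # posterior membership probabilities
    # Fisher's identity: observed score = E[complete data score | y].
    return resp * (y - mu) / sigma**2                # shape (n, K)


def score_based_fim(scores):
    """Average of the outer products of the individual scores (assumed form)."""
    scores = np.asarray(scores, dtype=float)
    return scores.T @ scores / scores.shape[0]


# Simulated data; all numerical values below are illustrative assumptions.
rng = np.random.default_rng(0)
n = 5000
true_mu = np.array([-2.0, 2.0])
weights = np.array([0.4, 0.6])
labels = rng.choice(2, size=n, p=weights)  # latent component of each observation
y = rng.normal(true_mu[labels], 1.0)

fim_hat = score_based_fim(mixture_scores(y, true_mu, weights))
print(fim_hat)
```

Only first derivatives of the log-likelihood enter this computation. In practice the estimate would be evaluated at a parameter estimate rather than at the true value, and in models where the conditional expectation has no closed form it would have to be approximated, which is the situation the stochastic approximation algorithm described above is meant to handle.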
