Approximating the inverse of the covariance matrix of a random vector, also known as the precision matrix, from empirical data is widely discussed in finance and engineering. In data-driven problems, the empirical data may be ``contaminated'', which raises the question of whether the approximate precision matrix is statistically reliable. In this paper, we concentrate on a widely studied sparse estimator of the precision matrix and investigate this question from the perspective of distributional stability. Specifically, we derive an explicit local Lipschitz bound for the distance between the distributions of the sparse estimator under two different data distributions (regarded as the true data distribution and the distribution of the ``contaminated'' data). The distance is measured by the Kantorovich metric on the set of all probability measures on a matrix space. We also present analogous results for the standard estimators of the covariance matrix and its eigenvalues. Furthermore, we discuss two applications and report numerical experiments.
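For orientation, the Kantorovich metric can be written in its standard dual form. The notation below ($Z$, $d$, $g$, $\mathsf{dl}_K$) is assumed here for illustration and is not taken from the paper. For probability measures $P$ and $Q$ with finite first moments on a metric space $(Z, d)$,

\[
\mathsf{dl}_K(P, Q) \;=\; \sup_{g} \left\{ \left| \int_Z g \, dP - \int_Z g \, dQ \right| \;:\; |g(z) - g(z')| \le d(z, z') \ \text{for all } z, z' \in Z \right\}.
\]

In this notation, a local Lipschitz bound of the kind described in the abstract would schematically read $\mathsf{dl}_K(\text{law of the estimator under } P,\ \text{law of the estimator under } Q) \le L \, \mathsf{dl}_K(P, Q)$ for $Q$ in a neighborhood of $P$, with $L$ a hypothetical constant. This is only a sketch of the general shape of such a result; the precise statement, constants, and conditions are those derived in the paper.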