Estimating the inverse of the covariance matrix of a random vector, also known as the precision matrix, from empirical data is widely discussed in finance and engineering. In data-driven problems, the empirical data may be ``contaminated'', which raises the question of whether the estimated precision matrix is statistically reliable. In this paper, we concentrate on a widely studied sparse estimator of the precision matrix and investigate this question from the perspective of distributional stability. Specifically, we derive an explicit local Lipschitz bound for the distance between the distributions of the sparse estimator under two different data distributions (regarded as the true data distribution and the distribution of the ``contaminated'' data). The distance is measured by the Kantorovich metric on the set of all probability measures on a matrix space. We also present analogous results for the standard estimators of the covariance matrix and its eigenvalues. Finally, we discuss several applications and report numerical experiments.
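For orientation, the Kantorovich metric between two probability measures $P$ and $Q$ on a metric space $(\mathcal{X}, d)$ is commonly written in the following dual form. This is the standard textbook definition, assumed here for illustration and not quoted from the paper; the symbols $\mathsf{dl}_K$, $\mathcal{F}_1$, $\hat{\Theta}_N$, $L_P$ and $\delta$ are illustrative notation.
\[
  \mathsf{dl}_K(P, Q) \;=\; \sup_{f \in \mathcal{F}_1} \left| \int_{\mathcal{X}} f \, dP - \int_{\mathcal{X}} f \, dQ \right|,
  \qquad
  \mathcal{F}_1 \;=\; \bigl\{ f : \mathcal{X} \to \mathbb{R} \;:\; |f(x) - f(y)| \le d(x, y) \ \text{for all } x, y \in \mathcal{X} \bigr\}.
\]
Under that assumption, a local Lipschitz bound of the kind described above would have the schematic shape
\[
  \mathsf{dl}_K\!\bigl( P^{\otimes N} \circ \hat{\Theta}_N^{-1},\; Q^{\otimes N} \circ \hat{\Theta}_N^{-1} \bigr)
  \;\le\; L_P \, \mathsf{dl}_K(P, Q)
  \qquad \text{whenever } \mathsf{dl}_K(P, Q) \le \delta,
\]
where $\hat{\Theta}_N$ stands for the sparse estimator built from $N$ samples, $P^{\otimes N} \circ \hat{\Theta}_N^{-1}$ is its distribution when the data follow $P$, and the constant $L_P$ and radius $\delta$ are placeholders whose explicit form is not given in this abstract.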