We consider the problem of estimating a covariance matrix for Gaussian data in a high-dimensional setting. Existing approaches include maximum likelihood estimation under a pre-specified sparsity pattern, l_1-penalized log-likelihood optimization, and ridge regularization of the sample covariance. We show that these three approaches can be addressed in a unified way, by considering the constrained optimization of an objective function that involves two suitably defined penalty terms. This unified procedure exploits the advantages of each individual approach, while bringing novelty in the combination of the three. We provide an efficient algorithm for the optimization of the regularized objective function and describe the relationship between the two penalty terms, thereby highlighting the importance of the joint application of the three methods. A simulation study shows that the sparse estimates of covariance matrices returned by the procedure are stable and accurate, in both low- and high-dimensional settings, and that their calculation is more efficient than with existing approaches under a partially known sparsity pattern. An illustration on sonar data is presented for the identification of the covariance structure among signals bounced off a certain material. The method is implemented in the publicly available R package gicf.