Many modern datasets exhibit dependencies among observations as well as among variables. This gives rise to the challenging problem of analyzing high-dimensional matrix-variate data with unknown dependence structures. To address this challenge, Kalaitzis et al. (2013) proposed the Bigraphical Lasso (BiGLasso), an estimator for precision matrices of matrix-normal distributions based on the Cartesian product of graphs. Subsequently, Greenewald, Zhou and Hero (GZH 2019) introduced a multiway tensor generalization of the BiGLasso estimator, known as the TeraLasso estimator. In this paper, we provide sharper rates of convergence in the Frobenius and operator norms for the BiGLasso and TeraLasso estimators of inverse covariance matrices. This improves upon the rates presented in GZH 2019. In particular, (a) we strengthen the bounds for the relative errors in the operator and Frobenius norms by a factor of approximately $\log p$; and (b) crucially, this improvement allows finite-sample estimation errors in both norms to be derived for the two-way Kronecker sum model. The two-way regime is important because it is the most theoretically challenging setting and, at the same time, the most common in applications. Normality is not needed in our proofs; instead, we consider sub-Gaussian ensembles and derive tight concentration of measure bounds using tensor unfolding techniques. The proof techniques may be of independent interest.
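As a small illustration of the two-way Kronecker sum model mentioned above, the following NumPy sketch builds a precision matrix $\Omega = A \oplus B = A \otimes I_q + I_p \otimes B$ from two factor precision matrices. The sparsity pattern of $\Omega$ is then the Cartesian product of the two factor graphs. The helper names (`kronecker_sum`, `chain_precision`) and the chain-graph factors with off-diagonal value $-0.4$ are assumptions chosen for illustration; they do not come from the paper, and the sketch does not implement the BiGLasso or TeraLasso estimators themselves.

```python
import numpy as np


def kronecker_sum(A, B):
    """Return A (+) B = kron(A, I_q) + kron(I_p, B) for square A (p x p), B (q x q)."""
    p, q = A.shape[0], B.shape[0]
    return np.kron(A, np.eye(q)) + np.kron(np.eye(p), B)


def chain_precision(n, off=-0.4, diag=1.0):
    """Hypothetical factor: tridiagonal precision matrix of a chain graph on n nodes."""
    return diag * np.eye(n) + off * (np.eye(n, k=1) + np.eye(n, k=-1))


# Illustrative factors: a 3-node chain and a 4-node chain.
A = chain_precision(3)
B = chain_precision(4)

# Precision matrix of the vectorized 3 x 4 matrix-variate observation.
Omega = kronecker_sum(A, B)

# Eigenvalues of a Kronecker sum are all pairwise sums of the factor eigenvalues.
eig_A = np.linalg.eigvalsh(A)
eig_B = np.linalg.eigvalsh(B)
eig_sum = np.sort(np.add.outer(eig_A, eig_B).ravel())
eig_Omega = np.sort(np.linalg.eigvalsh(Omega))

# Nonzero off-diagonal entries correspond to edges of the Cartesian product graph.
n_offdiag = np.count_nonzero(Omega - np.diag(np.diag(Omega)))
```

Here `eig_Omega` matches `eig_sum` up to floating-point error, and `n_offdiag` counts each edge of the product graph twice, once per triangle of the symmetric matrix. Compared with a Kronecker product model, the Kronecker sum keeps $\Omega$ sparse whenever both factors are sparse.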