In many areas of biomedical science, variables are measured repeatedly on each of many subjects. In such repeated-measurement settings, the dependence structure among variables between subjects may differ from that within a subject, and the two should be estimated separately. Ignoring this distinction can lead to questionable or even erroneous scientific conclusions. In this paper, we study sparse, positive-definite estimation of the between-subject and within-subject covariance matrices for high-dimensional repeated measurements. Our estimators are defined as solutions to convex optimization problems that can be solved efficiently. We establish estimation error rates for the proposed estimators of the two target matrices and demonstrate their favorable performance through theoretical analysis and comprehensive simulation studies. We further apply our methods to recover two covariance graphs of clinical variables from hemodialysis patients.
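To illustrate why the two dependence structures call for separate estimates, the following is a minimal sketch of the classical moment-based (one-way random-effects) decomposition for balanced data. It is not the convex-optimization estimator proposed in the paper. The model x_ij = b_i + e_ij, the function name `moment_covariances`, and the simulated matrices `sigma_b` and `sigma_w` are assumptions made only for this illustration.

```python
import numpy as np


def moment_covariances(x):
    """Moment estimates of between- and within-subject covariance.

    x has shape (n_subjects, n_repeats, p) and is assumed to follow the
    hypothetical model x_ij = b_i + e_ij with Cov(b_i) = Sigma_B and
    Cov(e_ij) = Sigma_W (balanced design, n_repeats >= 2).
    """
    n, m, p = x.shape
    subject_means = x.mean(axis=1)                      # shape (n, p)

    # Within-subject: pooled deviations from each subject's own mean.
    resid = (x - subject_means[:, None, :]).reshape(n * m, p)
    s_within = resid.T @ resid / (n * (m - 1))

    # The covariance of subject means estimates Sigma_B + Sigma_W / m,
    # so the within-subject part is subtracted to isolate Sigma_B.
    centered = subject_means - subject_means.mean(axis=0)
    s_means = centered.T @ centered / (n - 1)
    s_between = s_means - s_within / m
    return s_between, s_within


# Simulated example with illustrative (made-up) covariance matrices.
rng = np.random.default_rng(0)
n, m, p = 2000, 4, 3
sigma_b = np.array([[1.0, 0.5, 0.0],
                    [0.5, 1.0, 0.0],
                    [0.0, 0.0, 1.0]])
sigma_w = np.diag([0.5, 1.0, 2.0])

b = rng.multivariate_normal(np.zeros(p), sigma_b, size=n)       # (n, p)
e = rng.multivariate_normal(np.zeros(p), sigma_w, size=(n, m))  # (n, m, p)
x = b[:, None, :] + e

s_between, s_within = moment_covariances(x)
print(np.round(s_between, 2))
print(np.round(s_within, 2))
```

Because the between-subject estimate in this sketch is a difference of two matrices, it is not guaranteed to be positive definite, and neither estimate is sparse. Those are the two properties the estimators described above are designed to provide.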