In many areas of the biomedical sciences, random variables are commonly measured repeatedly across different subjects. In such a repeated-measurement setting, the between-subject and within-subject dependence structures among the random variables may differ and should be estimated differently. Ignoring this distinction may lead to questionable or even erroneous scientific conclusions. In this paper, we study the problem of sparse and positive-definite estimation of between-subject and within-subject covariance matrices for high-dimensional repeated measurements. Our estimators are defined as solutions to convex optimization problems, which can be solved efficiently. We establish estimation error rates for our proposed estimators of the two target matrices, and demonstrate their favorable performance through theoretical analysis and comprehensive simulation studies. We further apply our methods to recover two covariance graphs of clinical variables from hemodialysis patients.
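To make the between-subject and within-subject distinction concrete, the following is a minimal sketch and not the estimators proposed in the paper. It assumes a balanced design with the additive model x_ij = mu + b_i + e_ij, forms method-of-moments estimates of the two covariance matrices, and then applies off-diagonal soft-thresholding followed by eigenvalue clipping as a crude stand-in for the convex programs. The function names, the threshold `lam`, and the floor `eps` are hypothetical choices made for illustration only.

```python
import numpy as np


def moment_estimates(x):
    """Method-of-moments estimates under an assumed balanced design.

    x has shape (n_subjects, n_repeats, p). Assumed model (not from the
    paper): x_ij = mu + b_i + e_ij with Cov(b_i) = between, Cov(e_ij) = within.
    """
    n, m, p = x.shape
    subject_means = x.mean(axis=1)                          # shape (n, p)
    resid = (x - subject_means[:, None, :]).reshape(-1, p)  # deviations from subject means
    s_within = resid.T @ resid / (n * (m - 1))
    centered = subject_means - subject_means.mean(axis=0)
    s_means = centered.T @ centered / (n - 1)               # estimates between + within / m
    s_between = s_means - s_within / m
    return s_between, s_within


def soft_threshold_offdiag(s, lam):
    """Shrink off-diagonal entries toward zero by lam; keep the diagonal."""
    out = np.sign(s) * np.maximum(np.abs(s) - lam, 0.0)
    np.fill_diagonal(out, np.diag(s))
    return out


def clip_eigenvalues(s, eps=1e-4):
    """Symmetrize and raise eigenvalues below eps up to eps.

    Simplification: this restores positive definiteness but can undo exact
    zeros, which is one reason to solve a joint convex program instead.
    """
    s = (s + s.T) / 2
    vals, vecs = np.linalg.eigh(s)
    return (vecs * np.maximum(vals, eps)) @ vecs.T


# Example on synthetic data: 50 subjects, 4 repeats, 3 variables.
rng = np.random.default_rng(0)
data = rng.normal(size=(50, 1, 3)) + 0.5 * rng.normal(size=(50, 4, 3))
raw_between, raw_within = moment_estimates(data)
est_between = clip_eigenvalues(soft_threshold_offdiag(raw_between, lam=0.1))
est_within = clip_eigenvalues(soft_threshold_offdiag(raw_within, lam=0.1))
```

The raw between-subject estimate is a difference of two matrices and need not be positive semidefinite, and thresholding alone does not guarantee positive definiteness either. Under these assumptions, that gap is what motivates estimators that enforce sparsity and positive definiteness jointly.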