A growth curve model (GCM) characterizes how an outcome variable evolves, develops, and grows as a function of time and other predictors. It provides a particularly useful framework for modeling growth trends in longitudinal data. However, estimation and inference for GCMs with a large number of response variables face numerous challenges and remain underdeveloped. In this article, we study the high-dimensional multivariate-response linear GCM and develop the corresponding estimation and inference procedures. Our proposal is far from a straightforward extension and involves several innovative components. Specifically, we introduce a Kronecker product structure, which allows us to effectively decompose a very large covariance matrix and to pool the correlated samples to improve estimation accuracy. We devise a highly non-trivial multi-step approach that estimates the individual covariance components separately and effectively. We also develop rigorous statistical inference procedures to test both the global effects and the individual effects, and we establish their size and power properties as well as proper false discovery control. We demonstrate the effectiveness of the new method through both extensive simulations and the analysis of a longitudinal neuroimaging dataset for Alzheimer's disease.
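To illustrate why a Kronecker product structure helps with a very large covariance matrix, the following is a minimal sketch and not the estimation procedure of the article. It assumes, purely for illustration, that the full covariance factors into one component over time points and one over response variables; the names `sigma_time`, `sigma_resp`, and `ar1`, the AR(1) form of each factor, and the sizes `T` and `p` are all hypothetical choices that the abstract does not specify.

```python
import numpy as np


def ar1(n, rho):
    """Illustrative AR(1) correlation matrix (an assumption, not from the article)."""
    idx = np.arange(n)
    return rho ** np.abs(idx[:, None] - idx[None, :])


# Hypothetical sizes: T time points, p response variables.
T, p = 4, 50
sigma_time = ar1(T, 0.5)   # assumed T x T temporal component
sigma_resp = ar1(p, 0.3)   # assumed p x p response component

# Full (T*p) x (T*p) covariance implied by the Kronecker structure.
full = np.kron(sigma_time, sigma_resp)

# Free parameters: unstructured symmetric matrix vs. the two smaller factors.
free_full = (T * p) * (T * p + 1) // 2
free_kron = T * (T + 1) // 2 + p * (p + 1) // 2
print(free_full, free_kron)

# The inverse of a Kronecker product is the Kronecker product of the inverses,
# so only the two small factors ever need to be inverted.
inv_full = np.kron(np.linalg.inv(sigma_time), np.linalg.inv(sigma_resp))
assert np.allclose(inv_full @ full, np.eye(T * p))
```

Under these assumed sizes, the structured form needs far fewer free parameters than an unstructured covariance of the same dimension, which is the kind of reduction that makes estimating each component separately feasible.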