Understanding associations between paired high-dimensional longitudinal datasets is a fundamental yet challenging problem that arises across scientific domains, including longitudinal multi-omic studies. The difficulty stems from the complex, time-varying cross-covariance structure coupled with high dimensionality, which complicates both model formulation and statistical estimation. To address these challenges, we propose a new framework, termed Functional-Aggregated Cross-covariance Decomposition (FACD), tailored for canonical cross-covariance analysis between paired high-dimensional longitudinal datasets through a statistically efficient and theoretically grounded procedure. Unlike existing methods that are often limited to low-dimensional data or rely on explicit parametric modeling of temporal dynamics, FACD adaptively learns temporal structure by aggregating signals across features and naturally accommodates variable selection to identify the most relevant features associated across datasets. We establish statistical guarantees for FACD and demonstrate its advantages over existing approaches through extensive simulation studies. Finally, we apply FACD to a longitudinal multi-omic human study, revealing blood molecules with time-varying associations across omic layers during acute exercise.