Analyzing longitudinal data in health studies is challenging because of sparse and error-prone measurements, strong within-individual correlation, missing data, and varied trajectory shapes. Mixed-effects models (MM) address these challenges effectively, but they remain parametric models and can be computationally costly. In contrast, Functional Principal Component Analysis (FPCA) is a non-parametric approach, developed for regular and dense functional data, that flexibly describes temporal trajectories at a lower computational cost. This paper presents an empirical simulation study evaluating the behaviour of FPCA with sparse and error-prone repeated measures, and its robustness under different missing data schemes, in comparison with MM. The results show that FPCA is well suited to data that are missing at random because of dropout, except in the scenarios with the most frequent and systematic dropout. Like MM, FPCA fails under a missing-not-at-random mechanism. FPCA was then applied to describe the trajectories of four cognitive functions before clinical dementia and to contrast them with those of matched controls in a case-control study nested in a population-based aging cohort. The average cognitive decline of future dementia cases diverged suddenly from that of their matched controls, with a sharp acceleration 5 to 2.5 years prior to diagnosis.