Analyzing longitudinal data in health studies is challenging because of sparse and error-prone measurements, strong within-individual correlation, missing data, and varied trajectory shapes. While mixed-effects models (MM) effectively address these challenges, they remain parametric models and may incur high computational costs. In contrast, Functional Principal Component Analysis (FPCA) is a non-parametric approach, developed for regular and dense functional data, that flexibly describes temporal trajectories at a potentially lower computational cost. This paper presents an empirical simulation study evaluating the behaviour of FPCA with sparse and error-prone repeated measures, and its robustness under different missing data schemes, in comparison with MM. The results show that FPCA is well suited to data missing at random because of dropout, except in the scenarios with the most frequent and systematic dropout. Like MM, FPCA fails under a missing-not-at-random mechanism. FPCA was then applied to describe the trajectories of four cognitive functions before clinical dementia and to contrast them with those of matched controls in a case-control study nested in a population-based aging cohort. The average cognitive decline of future dementia cases diverged suddenly from that of their matched controls, with a sharp acceleration 5 to 2.5 years prior to diagnosis.