The increasing availability of repeated measurements in biomedical studies has recently motivated the development of several statistical methods for the dynamic prediction of survival in settings where a large (potentially high-dimensional) number of longitudinal covariates is available. These methods differ both in how they model the longitudinal covariate trajectories and in how they specify the relationship between the longitudinal covariates and the survival outcome. Because these methods are still quite new, little is known about their applicability, limitations and performance when applied to real-world data. To investigate these questions, we compare the predictive performance of these methods and of two simpler prediction approaches on three datasets that differ in outcome type, sample size, number of longitudinal covariates and length of follow-up. We discuss how different modelling choices affect computing time and the ability to accommodate unbalanced study designs, and we compare the predictive performance of the different approaches using a range of performance measures and landmark times.