Time-to-event analysis (survival analysis) is used when the response of interest is the time until a pre-specified event occurs. Time-to-event data are sometimes discrete, either because time itself is discrete or because failure times are grouped into intervals or measurements are rounded. In addition, an individual's failure may be one of several distinct failure types, known as competing risks (events). Most methods and software packages for survival regression analysis assume that time is measured on a continuous scale. It is well known that naively applying standard continuous-time models to discrete-time data may yield biased estimators of the discrete-time models. This work introduces PyDTS, a Python package for simulating, estimating and evaluating semi-parametric competing-risks models for discrete-time survival data. The package implements a fast procedure that accommodates regularized regression methods such as LASSO and elastic net. A simulation study showcases the flexibility and accuracy of the package. The utility of the package is demonstrated by analysing the Medical Information Mart for Intensive Care (MIMIC)-IV dataset to predict hospital length of stay.
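To make the discrete-time competing-risks setting concrete, the following is a minimal, covariate-free sketch of empirical cause-specific hazards and the resulting cumulative incidence functions. It is not the PyDTS API and not the semi-parametric estimator described above. The function names, the toy data, and the convention that an observation censored at time t is still counted as at risk at t are all assumptions made for illustration.

```python
from collections import Counter


def cause_specific_hazards(times, events, n_causes):
    """Empirical discrete-time cause-specific hazards (illustrative helper).

    times  : observed discrete times, positive integers
    events : 0 for censored, j in 1..n_causes for failure type j
    Returns {j: [hazard at t=1, ..., hazard at t=max(times)]}.
    """
    max_t = max(times)
    counts = Counter(zip(times, events))
    hazards = {j: [] for j in range(1, n_causes + 1)}
    for t in range(1, max_t + 1):
        # Assumed convention: everyone with observed time >= t is at risk at t.
        at_risk = sum(1 for x in times if x >= t)
        for j in hazards:
            hazards[j].append(counts[(t, j)] / at_risk if at_risk else 0.0)
    return hazards


def cumulative_incidence(hazards):
    """CIF_j(t) = sum over k <= t of hazard_j(k) * S(k-1), S = overall survival."""
    n_times = len(next(iter(hazards.values())))
    cif = {j: [] for j in hazards}
    running = {j: 0.0 for j in hazards}
    surv_prev = 1.0  # S(0) = 1
    for k in range(n_times):
        total_hazard = 0.0
        for j in hazards:
            running[j] += hazards[j][k] * surv_prev
            cif[j].append(running[j])
            total_hazard += hazards[j][k]
        # Surviving time k means failing from none of the causes at k.
        surv_prev *= 1.0 - total_hazard
    return cif


# Hypothetical toy data: six individuals, two competing event types.
times = [1, 1, 2, 2, 3, 3]
events = [1, 0, 2, 1, 0, 2]

hazards = cause_specific_hazards(times, events, n_causes=2)
cif = cumulative_incidence(hazards)
print(hazards)
print(cif)
```

In this toy example the two cumulative incidences and the overall survival probability at the last time point sum to one, which is a quick sanity check on the decomposition.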