Motivated by the need to analyze continuously updated data sets in time-to-event modeling, we propose a novel nonparametric approach to estimating the conditional hazard function given a set of continuous and discrete predictors. The method is based on a representation of the conditional hazard as the ratio of a joint density to a conditional expectation, both determined by the distribution of the observed variables. We show that such ratio representations are available for univariate and bivariate time-to-event outcomes under common types of random censoring and truncation, when some individuals may be cured, and for competing risks. This opens the door to nonparametric approaches in many time-to-event predictive models. To estimate the joint densities and conditional expectations, we propose recursive kernel smoothing, which is well suited to online estimation. We derive asymptotic results for these estimators and show that they achieve optimal convergence rates. Simulation experiments show good finite-sample performance of the recursive estimator under right censoring. The method is applied to a real data set on primary breast cancer.
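To make the ratio idea concrete, the following is a minimal sketch for the right-censoring case with a single continuous predictor. It is not the estimator as specified in the paper. It assumes the standard identity that holds when the event time and the censoring time are independent given X: with Y = min(T, C) and delta = 1{T <= C}, the conditional hazard is lambda(t | x) = f(t, 1, x) / (P(Y >= t | X = x) f_X(x)), where f(t, 1, x) is the joint density of (Y, delta = 1, X). Both terms are updated with a Wolverton-Wagner type recursion, so each new observation costs O(1) per evaluation point. The class name `RecursiveHazard`, the Gaussian kernel, and the bandwidth sequence h_i = 0.5 i^(-1/6) are illustrative assumptions.

```python
import numpy as np


def gauss(u):
    """Standard Gaussian kernel (an illustrative choice)."""
    return np.exp(-0.5 * u * u) / np.sqrt(2.0 * np.pi)


class RecursiveHazard:
    """Online estimate of lambda(t | x) at one fixed point (t, x), scalar x.

    Hypothetical helper for illustration only.
    """

    def __init__(self, t, x, c=0.5, power=1.0 / 6.0):
        self.t, self.x = t, x
        self.c, self.power = c, power  # bandwidth h_i = c * i**(-power), assumed
        self.n = 0
        self.num = 0.0  # estimates the joint density of (Y, delta=1, X) at (t, x)
        self.den = 0.0  # estimates P(Y >= t | X = x) * f_X(x)

    def update(self, y, delta, x_obs):
        """Fold in one observation (y, delta, x_obs) without revisiting old data."""
        self.n += 1
        h = self.c * self.n ** (-self.power)
        kx = gauss((self.x - x_obs) / h) / h
        kt = gauss((self.t - y) / h) / h
        w = 1.0 / self.n
        # Running averages: new = (1 - 1/n) * old + (1/n) * current kernel term.
        self.num = (1.0 - w) * self.num + w * delta * kt * kx
        self.den = (1.0 - w) * self.den + w * float(y >= self.t) * kx
        return self

    def hazard(self):
        """Ratio of the two running estimates."""
        return self.num / self.den if self.den > 0 else float("nan")


def simulate(n, seed=0):
    """Toy data: event hazard 1 + x, independent censoring hazard 0.5."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(0.0, 1.0, n)
    t_event = rng.exponential(1.0 / (1.0 + x))
    c_time = rng.exponential(2.0, n)
    y = np.minimum(t_event, c_time)
    delta = (t_event <= c_time).astype(float)
    return y, delta, x


y, delta, x = simulate(20000)
est = RecursiveHazard(t=0.5, x=0.5)
for yi, di, xi in zip(y, delta, x):
    est.update(yi, di, xi)  # data arrive one at a time
print(est.hazard())  # the true hazard in this toy model is 1.5
```

Because only the two running averages and the sample count are stored, the estimate can be refreshed as new records arrive, which is the property that makes recursive smoothing attractive for continuously updated data. In practice one such object would be kept per evaluation point on a grid of (t, x) values.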