Although the literature on feature selection for the Cox model is extensive, no existing approach controls the false discovery rate (FDR) in finite samples; the available guarantees hold only as the sample size tends to infinity. Moreover, the literature contains no formal power analysis of the knockoffs framework for survival data. To address these issues, we propose a novel controlled feature selection approach for the Cox model based on knockoffs. We establish that the proposed method controls the FDR in finite samples regardless of the number of covariates. Under mild regularity conditions, we also show that the power of our method tends to one as the sample size tends to infinity. To the best of our knowledge, this is the first formal theoretical result on the power of the knockoffs procedure in the survival setting. Simulation studies confirm that our method has appealing finite-sample performance, with the desired FDR control and high power. We further demonstrate the method's performance on a real data example.
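For concreteness, the two quantities assessed in simulation studies of this kind can be computed per replication as in the minimal sketch below. It uses the standard definitions: the false discovery proportion is the share of selected features that are truly null, the FDR is its expectation over replications, and power is the share of truly relevant features that are selected. The function name `fdp_and_power` and its arguments are illustrative assumptions, not taken from the paper.

```python
def fdp_and_power(selected, true_support):
    """Return (false discovery proportion, power) for a single replication.

    selected     -- indices of features chosen by a selection procedure
    true_support -- indices of features with truly nonzero effects
    Both names are hypothetical and used here only for illustration.
    """
    selected = set(selected)
    true_support = set(true_support)

    false_discoveries = len(selected - true_support)
    true_discoveries = len(selected & true_support)

    # max(., 1) keeps each ratio defined when a set is empty.
    fdp = false_discoveries / max(len(selected), 1)
    power = true_discoveries / max(len(true_support), 1)
    return fdp, power


# One false discovery among four selections; three of four signals recovered.
fdp, power = fdp_and_power(selected=[0, 1, 2, 7], true_support=[0, 1, 2, 3])
print(fdp, power)  # 0.25 0.75
```

Averaging the first value over many simulated data sets gives an empirical estimate of the FDR, which can then be compared against the nominal level.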