A case-cohort design is a two-phase sampling design frequently used to analyze censored survival data cost-effectively; the subcohort is usually selected by simple random sampling or stratified simple random sampling. In this paper, we propose an efficient procedure, based on balanced sampling, for selecting the subcohort in a case-cohort design. A sample selected by a balanced sampling procedure is automatically calibrated on the auxiliary variables. When fitting a Cox model, calibrating the sampling weights has been shown to yield more efficient estimators of the regression coefficients (Breslow et al., 2009a, b). Extensive simulation experiments show that the proposed procedure reduces the variability of the estimators relative to simple random sampling. The proposed design and estimation procedure are also illustrated with the well-known National Wilms Tumor Study dataset.
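To make the idea of a balanced sample concrete, the following minimal sketch compares the imbalance of a single simple random sample with that of a sample chosen to be approximately balanced on one auxiliary variable. It is an illustration under stated assumptions, not the procedure proposed in the paper: the auxiliary variable is simulated, the names `ht_total` and `best_of_k_srs` are hypothetical, and a crude "best of k draws" rule stands in for a genuine balanced sampling algorithm such as the cube method. A sample is balanced on x when the Horvitz-Thompson estimate of the total of x matches the known population total.

```python
import random


def ht_total(sample, x, pi):
    """Horvitz-Thompson estimate of the population total of x,
    assuming every unit has the same inclusion probability pi."""
    return sum(x[i] / pi for i in sample)


def best_of_k_srs(x, n, k, rng):
    """Draw k simple random samples of size n and keep the one whose
    HT estimate of the total of x is closest to the true total.

    Assumption: this is a crude rejective stand-in for balanced sampling,
    NOT the cube method. It also ignores that picking the best draw
    perturbs the inclusion probabilities away from n / N.
    """
    N = len(x)
    pi = n / N
    true_total = sum(x)
    candidates = [rng.sample(range(N), n) for _ in range(k)]
    errors = [abs(ht_total(s, x, pi) - true_total) for s in candidates]
    best = min(range(k), key=errors.__getitem__)
    # errors[0] is the imbalance of an ordinary single SRS draw.
    return candidates[best], errors[best], errors[0]


rng = random.Random(0)
# Hypothetical auxiliary variable known for the full cohort of 1000 subjects.
x = [rng.gauss(50.0, 10.0) for _ in range(1000)]

subcohort, err_balanced, err_srs = best_of_k_srs(x, n=100, k=200, rng=rng)
print(f"imbalance of a single SRS draw:   {err_srs:.1f}")
print(f"imbalance of the best of 200:     {err_balanced:.1f}")
```

By construction the selected subcohort is never less balanced than the first simple random draw. A real application would use a proper balanced sampling algorithm, which controls the inclusion probabilities exactly and can balance on several auxiliary variables at once.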