In two-phase designs, the outcome and several covariates and confounders are measured in the first phase; a new predictor of interest, which may be costly to collect, is then measured on a subsample in the second phase, without incurring the cost of recruiting additional subjects. Using the information gathered in the first phase, the second-phase subsample can be selected to improve the efficiency of testing and estimating the effect of the new predictor on the outcome. Past studies have focused on optimal two-phase sampling schemes for statistical inference on local ($\beta = o(1)$) effects of the predictor of interest. In this study, we propose an extension of two-phase designs that employs an optimal sampling scheme for estimating predictor effects with pseudo conditional likelihood estimators in case-control studies. This approach is applicable to both local and non-local effects. We demonstrate the effectiveness of the proposed sampling scheme through simulation studies and an analysis of data from 170 patients hospitalized for treatment of COVID-19. The results show a significant improvement in the estimation of the parameter of interest.