In this study, we estimate the heterogeneous treatment effect (HTE) for a survival outcome that is subject to censoring, in a setting where the covariates are high-dimensional. We use data from both a randomized controlled trial (RCT), considered the gold standard, and real-world data (RWD), which may be affected by hidden confounding. Obtaining a more efficient HTE estimate from such an integrative analysis requires insight into the data-generating mechanism, in particular an accurate characterization of the bias due to unmeasured confounding. To this end, we propose a penalized-regression-based integrative approach that simultaneously estimates parameters, selects variables, and identifies whether unmeasured confounding is present. We rigorously establish the consistency, asymptotic normality, and efficiency gains of the proposed estimator. Finally, we apply the proposed method to estimate the HTE of lobar/sublobar resection on the survival of lung cancer patients. The RCT is a multicenter non-inferiority randomized phase 3 trial, and the RWD come from a clinical oncology cancer registry in the United States. The analysis indicates that unmeasured confounding exists and that the integrative approach improves the efficiency of HTE estimation.