Target trial emulation (TTE) is a popular framework for observational studies based on electronic health records (EHR). A key component of this framework is determining the patient population eligible for inclusion in both a target trial of interest and its observational emulation. Missingness in the variables that define eligibility criteria, however, presents a major challenge to determining the eligible population when emulating a target trial with an observational study. In practice, patients with incomplete data are almost always excluded from analysis, despite the possibility of selection bias, which can arise when subjects with observed eligibility data are fundamentally different from excluded subjects. To the best of our knowledge, very little work has been done to mitigate this concern. In this paper, we propose a novel conceptual framework to address selection bias in TTE studies, tailored to time-to-event endpoints, and describe estimation and inferential procedures based on inverse probability weighting (IPW). Using an EHR-based simulation infrastructure, developed to reflect the complexity of EHR data, we characterize common settings in which missing eligibility data poses a threat of selection bias and investigate the ability of the proposed methods to address it. Finally, using EHR databases from Kaiser Permanente, we demonstrate the use of our method to evaluate the effect of bariatric surgery on microvascular outcomes among a cohort of severely obese patients with Type II diabetes mellitus (T2DM).