Target trial emulation (TTE) is a popular framework for observational studies based on electronic health records (EHR). A key component of this framework is determining the patient population eligible for inclusion in both the target trial of interest and its observational emulation. Missingness in the variables that define eligibility criteria, however, presents a major challenge to determining the eligible population when a target trial is emulated with an observational study. In practice, patients with incomplete data are almost always excluded from analysis. This practice risks selection bias, which can arise when subjects with observed eligibility data differ systematically from the excluded subjects. To the best of our knowledge, very little work has been done to mitigate this concern. In this paper, we propose a novel conceptual framework to address selection bias in TTE studies, tailored to time-to-event endpoints, and describe estimation and inferential procedures based on inverse probability weighting (IPW). Using an EHR-based simulation infrastructure developed to reflect the complexity of EHR data, we characterize common settings in which missing eligibility data poses a threat of selection bias and investigate the ability of the proposed methods to address it. Finally, using EHR databases from Kaiser Permanente, we demonstrate the use of our method to evaluate the effect of bariatric surgery on microvascular outcomes in a cohort of severely obese patients with Type II diabetes mellitus (T2DM).
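To make the general idea behind IPW concrete, the following is a minimal sketch, not the estimator proposed in this paper. It assumes a single discrete covariate (`stratum`), an indicator of whether eligibility data are complete (`observed`), and a binary outcome in place of a time-to-event endpoint. All names and simulated values are hypothetical and chosen for illustration only. Subjects with complete eligibility data are weighted by the inverse of the estimated probability of being observed within their stratum, so that the weighted complete cases resemble the full cohort.

```python
import numpy as np

def ipw_weights(stratum, observed):
    """Inverse-probability weights for having complete eligibility data.

    The probability of being observed is estimated as the observed
    fraction within each stratum (a simple nonparametric model, assumed
    here for illustration). Unobserved subjects receive weight 0.
    """
    stratum = np.asarray(stratum)
    observed = np.asarray(observed, dtype=bool)
    weights = np.zeros(stratum.shape[0], dtype=float)
    for s in np.unique(stratum):
        in_s = stratum == s
        p_obs = observed[in_s].mean()
        if p_obs == 0:
            # Positivity violation: nobody in this stratum is observed.
            raise ValueError(f"no observed subjects in stratum {s}")
        weights[in_s & observed] = 1.0 / p_obs
    return weights

# Hypothetical simulated cohort (all values are illustrative assumptions).
rng = np.random.default_rng(0)
n = 5000
stratum = rng.integers(0, 3, size=n)              # discrete covariate
p_true = np.array([0.9, 0.6, 0.3])[stratum]       # missingness depends on stratum
observed = rng.random(n) < p_true                 # complete eligibility data?
outcome = rng.random(n) < (0.1 + 0.1 * stratum)   # outcome also depends on stratum

w = ipw_weights(stratum, observed)

full_cohort = outcome.mean()                      # unavailable in practice
complete_case = outcome[observed].mean()          # naive complete-case estimate
weighted = np.average(outcome[observed], weights=w[observed])

print(f"full cohort:   {full_cohort:.3f}")
print(f"complete case: {complete_case:.3f}")
print(f"IPW-weighted:  {weighted:.3f}")
```

Because the weights here are estimated nonparametrically within strata, the weighted complete cases reproduce the full cohort's covariate distribution exactly. The weighted outcome mean is only approximately unbiased, and only when missingness depends on nothing beyond the modeled covariate.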