Instrumental variables (IVs) are widely used to estimate causal effects from non-randomized data. A canonical example is a randomized trial with noncompliance, in which the randomized treatment assignment serves as an IV for the non-ignorable treatment received. Under a monotonicity assumption, a valid IV nonparametrically identifies the average treatment effect among a latent complier subgroup, an effect whose generalizability is often debated. Many studies have multiple versions of an IV; for instance, in a multicenter clinical trial, different study sites may use different nudges to take the same treatment. These different versions of an IV may result in different compliance rates and offer a unique opportunity to study the generalizability of IV estimates. In this article, we introduce a novel nested IV assumption and study identification of the average treatment effect among two latent subgroups: always-compliers and switchers, who are defined based on the joint potential treatment received under two versions of a binary IV. We derive the efficient influence function for the SWitcher Average Treatment Effect (SWATE) under a nonparametric model and propose efficient estimators. We then propose formal statistical tests of the generalizability of IV estimates under the nested IV framework. The proposed tests are flexible nonparametric generalizations of classical overidentification tests that allow nuisance parameters to be estimated using machine learning tools. We apply the proposed method to the Prostate, Lung, Colorectal and Ovarian (PLCO) Cancer Screening Trial and study the causal effect of colorectal cancer screening and its generalizability.
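As background for the complier effect mentioned above, the following minimal sketch illustrates the classical Wald ratio for a single binary IV: the intention-to-treat effect on the outcome divided by the intention-to-treat effect on treatment uptake. It is not the SWATE estimator or the efficient estimator proposed in the article. The function name `wald_estimate` and the toy data (six compliers and four never-takers per arm, with a complier effect of 3 built in) are assumptions made purely for illustration.

```python
from statistics import mean


def wald_estimate(z, d, y):
    """Wald ratio: (ITT effect on outcome) / (ITT effect on treatment uptake).

    z: binary IV (e.g. randomized assignment), d: binary treatment received,
    y: outcome. Illustrative helper only, not the article's estimator.
    """
    y1 = mean(yi for zi, yi in zip(z, y) if zi == 1)
    y0 = mean(yi for zi, yi in zip(z, y) if zi == 0)
    d1 = mean(di for zi, di in zip(z, d) if zi == 1)
    d0 = mean(di for zi, di in zip(z, d) if zi == 0)
    itt_d = d1 - d0
    if itt_d == 0:
        raise ValueError("the IV does not change treatment uptake")
    return (y1 - y0) / itt_d


# Hypothetical toy data: each arm has 6 compliers and 4 never-takers.
# Compliers have outcome 2.0 untreated and 5.0 treated (effect of 3);
# never-takers always have outcome 1.0 and never take the treatment.
z = [1] * 10 + [0] * 10
d = [1] * 6 + [0] * 4 + [0] * 10
y = [5.0] * 6 + [1.0] * 4 + [2.0] * 6 + [1.0] * 4

estimate = wald_estimate(z, d, y)
print(estimate)  # close to 3, the complier effect built into the toy data
```

In this toy setup the ratio recovers the effect among compliers only; it says nothing about never-takers, which is the generalizability question the article addresses.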