Nonparametric Inference with an Instrumental Variable under a Separable Binary Treatment Choice Model

Instrumental variable (IV) methods are widely used to infer treatment effects in the presence of unmeasured confounding. In this paper, we study nonparametric inference with an IV under a separable binary treatment choice model, which posits that the odds of taking the treatment, conditional on the instrument and the treatment-free potential outcome, factor into separable components for each variable. While nonparametric identification of smooth functionals of the treatment-free potential outcome among the treated, such as the average treatment effect on the treated, has been established under this model, corresponding nonparametric efficient estimation has proven elusive because the nuisance parameters are variationally dependent and defined in terms of counterfactual quantities. To address this challenge, we introduce a new variationally independent parameterization based on nuisance functions defined directly from the observed data. This parameterization, coupled with a novel fixed-point argument, enables the use of modern machine learning methods for nuisance function estimation. We characterize the semiparametric efficiency bound for any smooth functional of the treatment-free potential outcome among the treated and construct a corresponding semiparametric efficient estimator without imposing any unnecessary restriction on nuisance functions. Furthermore, we describe a straightforward generative model justifying our identifying assumptions and characterize empirically falsifiable implications of the framework to evaluate our assumptions in practical settings. Our approach extends to nonlinear treatment effects, population-level effects, and nonignorable missing data settings. We illustrate our methods through simulation studies and an application to the Job Corps study.
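
As a rough sketch of the separability restriction described above, the model can be written on the odds scale as follows. The notation is an assumption, not taken from the text: $A$ denotes the binary treatment, $Z$ the instrument, $Y^{a=0}$ the treatment-free potential outcome, and $\alpha$, $\beta$ are unspecified positive functions; any measured covariates are suppressed. The second display uses the standard consistency assumption to show why the average treatment effect on the treated depends on the counterfactual only through the treatment-free potential outcome among the treated.

```latex
% Separable treatment choice model (notation assumed for illustration):
\frac{\Pr(A = 1 \mid Y^{a=0} = y,\, Z = z)}{\Pr(A = 0 \mid Y^{a=0} = y,\, Z = z)}
  = \alpha(y)\,\beta(z)
\quad\Longleftrightarrow\quad
\operatorname{logit} \Pr(A = 1 \mid Y^{a=0} = y,\, Z = z)
  = \log \alpha(y) + \log \beta(z).

% Average treatment effect on the treated, under consistency (Y = Y^{a=1} when A = 1):
\mathrm{ATT}
  = E\left[ Y^{a=1} - Y^{a=0} \mid A = 1 \right]
  = E\left[ Y \mid A = 1 \right] - E\left[ Y^{a=0} \mid A = 1 \right].
```

Equivalently, the log-odds of treatment contain no interaction between the instrument and the treatment-free potential outcome.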
