Instrumental variable (IV) regression is a standard strategy for learning causal relationships between confounded treatment and outcome variables from observational data. It relies on an instrumental variable, which affects the outcome only through the treatment. In classical IV regression, learning proceeds in two stages: stage 1 performs linear regression from the instrument to the treatment, and stage 2 performs linear regression from the treatment to the outcome, conditioned on the instrument. We propose a novel method, deep feature instrumental variable regression (DFIV), to address the case where the relations between instruments, treatments, and outcomes may be nonlinear. In this case, deep neural networks are trained to define informative nonlinear features on the instruments and treatments. We propose an alternating training regime for these features to ensure good end-to-end performance when composing stages 1 and 2, thus obtaining highly flexible feature maps in a computationally efficient manner. DFIV outperforms recent state-of-the-art methods on challenging IV benchmarks, including settings involving high-dimensional image data. DFIV also exhibits competitive performance in off-policy policy evaluation for reinforcement learning, which can be understood as an IV regression task.
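The classical two-stage procedure described above can be illustrated with a minimal sketch. This is the linear baseline only, not DFIV: it covers the scalar, linear case with ordinary least squares in both stages. The function names `two_stage_least_squares` and `simulate`, and the synthetic data-generating process with its coefficients, are assumptions made purely for illustration.

```python
import numpy as np


def two_stage_least_squares(z, x, y):
    """Linear IV sketch for a scalar instrument z, treatment x, outcome y."""
    n = len(z)
    # Instrument design matrix with an intercept column.
    Z = np.column_stack([np.ones(n), z])

    # Stage 1: linear regression from the instrument to the treatment.
    w1, *_ = np.linalg.lstsq(Z, x, rcond=None)
    x_hat = Z @ w1  # fitted treatment given the instrument

    # Stage 2: linear regression from the stage-1 prediction to the outcome.
    X_hat = np.column_stack([np.ones(n), x_hat])
    w2, *_ = np.linalg.lstsq(X_hat, y, rcond=None)
    return w2  # [intercept, estimated causal slope]


def simulate(n=20000, true_effect=2.0, seed=0):
    """Hypothetical confounded data: u affects both x and y, z affects only x."""
    rng = np.random.default_rng(seed)
    z = rng.normal(size=n)  # instrument
    u = rng.normal(size=n)  # unobserved confounder
    x = 1.5 * z + u + 0.1 * rng.normal(size=n)
    y = true_effect * x + 3.0 * u + 0.1 * rng.normal(size=n)
    return z, x, y


if __name__ == "__main__":
    z, x, y = simulate()
    iv_slope = two_stage_least_squares(z, x, y)[1]
    ols_slope = np.polyfit(x, y, 1)[0]  # naive regression of y on x
    print(f"2SLS slope: {iv_slope:.3f}, naive OLS slope: {ols_slope:.3f}")
```

On this synthetic example the naive regression of the outcome on the treatment absorbs the confounder's influence, while the two-stage estimate should land close to the simulated effect, because the instrument is independent of the confounder by construction.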