We address the setting of Proxy Causal Learning (PCL), whose goal is to estimate causal effects from observational data in the presence of hidden confounding. Proxy methods accomplish this using two proxy variables related to the latent confounder: a treatment proxy (related to the treatment) and an outcome proxy (related to the outcome). Two approaches have been proposed for causal effect estimation given proxy variables; however, only one of them has found mainstream acceptance, because the other was understood to require density ratio estimation, a challenging task in high dimensions. In the present work, we propose a practical and effective implementation of the second approach, which bypasses explicit density ratio estimation and is suitable for continuous and high-dimensional treatments. We employ kernel ridge regression to derive estimators, resulting in simple closed-form solutions for dose-response and conditional dose-response curves, along with consistency guarantees. Empirically, our methods match or outperform existing frameworks on synthetic and real-world datasets.
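For orientation, the closed form of a generic kernel ridge regression estimator is sketched below. This is the standard textbook result, not the proxy-based estimator proposed in this work, and the notation is an assumption introduced here for illustration: $n$ training pairs $(x_i, y_i)$, a kernel $k$ with reproducing kernel Hilbert space $\mathcal{H}$, and a regularization parameter $\lambda > 0$.

```latex
% Generic kernel ridge regression (illustrative notation, not the paper's estimator)
\hat{f} = \operatorname*{arg\,min}_{f \in \mathcal{H}}
  \frac{1}{n} \sum_{i=1}^{n} \bigl( y_i - f(x_i) \bigr)^2 + \lambda \lVert f \rVert_{\mathcal{H}}^2,
\qquad
\hat{f}(x) = \mathbf{k}_x^{\top} \bigl( \mathbf{K} + n \lambda \mathbf{I}_n \bigr)^{-1} \mathbf{y},

% where
\mathbf{K}_{ij} = k(x_i, x_j), \quad
(\mathbf{k}_x)_i = k(x_i, x), \quad
\mathbf{y} = (y_1, \dots, y_n)^{\top}.
```

The estimate at a new point requires only one linear solve with the $n \times n$ Gram matrix, which is the sense in which estimators built from such regressions admit simple closed-form solutions.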