Negative control is a strategy for learning the causal relationship between treatment and outcome in the presence of unmeasured confounding. The treatment effect can nonetheless be identified if two auxiliary variables are available: a negative control treatment (which has no effect on the actual outcome), and a negative control outcome (which is not affected by the actual treatment). These auxiliary variables can also be viewed as proxies for a traditional set of control variables, and they bear resemblance to instrumental variables. I propose a family of algorithms based on kernel ridge regression for learning nonparametric treatment effects with negative controls. Examples include dose response curves, dose response curves with distribution shift, and heterogeneous treatment effects. Data may be discrete or continuous, and low, high, or infinite dimensional. I prove uniform consistency and provide finite sample rates of convergence. I estimate the dose response curve of cigarette smoking on infant birth weight adjusting for unobserved confounding due to household income, using a data set of singleton births in the state of Pennsylvania between 1989 and 1991.
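As background for the kernel ridge regression building block mentioned above, here is a minimal sketch of ordinary kernel ridge regression in pure Python. It is not the negative control estimator proposed in this work, only the standard closed-form regression that such algorithms build on. The function names, the Gaussian kernel, the scalar inputs, the `n * reg` scaling of the ridge penalty, and the toy sine data are all assumptions made for illustration.

```python
import math


def gaussian_kernel(x, y, lengthscale=1.0):
    """Gaussian (RBF) kernel between two scalars (illustrative choice)."""
    return math.exp(-((x - y) ** 2) / (2.0 * lengthscale ** 2))


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are not modified.
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Swap in the row with the largest entry in this column.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_kernel_ridge(xs, ys, reg=1e-3, lengthscale=1.0):
    """Fit kernel ridge regression and return a prediction function.

    Solves (K + n * reg * I) alpha = y, then predicts with
    f(x) = sum_i alpha_i * k(x, x_i).
    """
    n = len(xs)
    gram = [[gaussian_kernel(xi, xj, lengthscale) for xj in xs] for xi in xs]
    for i in range(n):
        gram[i][i] += n * reg  # ridge penalty on the diagonal
    alpha = solve_linear_system(gram, ys)

    def predict(x_new):
        return sum(
            a * gaussian_kernel(x_new, xi, lengthscale)
            for a, xi in zip(alpha, xs)
        )

    return predict


# Toy data (hypothetical): noiseless samples of sin(x) on [0, 2].
xs = [i / 10 for i in range(21)]
ys = [math.sin(x) for x in xs]
predict = fit_kernel_ridge(xs, ys, reg=1e-3, lengthscale=0.5)
```

Calling `predict(1.0)` returns the fitted value at a new input. The regularization `reg` and the kernel `lengthscale` are tuning parameters here, and the values above are arbitrary.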