Negative control is a strategy for learning the causal relationship between treatment and outcome in the presence of unmeasured confounding. The treatment effect can nonetheless be identified if two auxiliary variables are available: a negative control treatment (which has no effect on the actual outcome), and a negative control outcome (which is not affected by the actual treatment). These auxiliary variables can also be viewed as proxies for a traditional set of control variables, and they bear resemblance to instrumental variables. I propose a family of algorithms based on kernel ridge regression for learning nonparametric treatment effects with negative controls. Examples include dose response curves, dose response curves with distribution shift, and heterogeneous treatment effects. Data may be discrete or continuous, and low, high, or infinite dimensional. I prove uniform consistency and provide finite sample rates of convergence. I estimate the dose response curve of cigarette smoking on infant birth weight adjusting for unobserved confounding due to household income, using a data set of singleton births in the state of Pennsylvania between 1989 and 1991.
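To illustrate the building block named above, here is a minimal sketch of ordinary kernel ridge regression in closed form. It is not the negative control estimator itself, which combines several such regressions. The Gaussian kernel, the function names (`gaussian_kernel`, `krr_fit`, `krr_predict`), the hyperparameter values, and the toy data are all assumptions made for illustration.

```python
import numpy as np


def gaussian_kernel(A, B, lengthscale=1.0):
    """Gaussian (RBF) kernel matrix between the rows of A and the rows of B."""
    sq_dists = (
        np.sum(A**2, axis=1)[:, None]
        + np.sum(B**2, axis=1)[None, :]
        - 2.0 * A @ B.T
    )
    # Clip tiny negative values caused by floating point round-off.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * lengthscale**2))


def krr_fit(X, y, lam=1e-3, lengthscale=1.0):
    """Solve (K + n * lam * I) alpha = y for the dual coefficients alpha."""
    n = X.shape[0]
    K = gaussian_kernel(X, X, lengthscale)
    return np.linalg.solve(K + n * lam * np.eye(n), y)


def krr_predict(X_train, alpha, X_new, lengthscale=1.0):
    """Evaluate the fitted function f(x) = sum_i alpha_i k(x, x_i) at X_new."""
    return gaussian_kernel(X_new, X_train, lengthscale) @ alpha


# Toy data (hypothetical): a noisy sine curve standing in for a smooth response.
rng = np.random.default_rng(0)
X = rng.uniform(-3.0, 3.0, size=(200, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(200)

alpha = krr_fit(X, y, lam=1e-3, lengthscale=0.5)
X_grid = np.linspace(-2.5, 2.5, 50)[:, None]
y_hat = krr_predict(X, alpha, X_grid, lengthscale=0.5)
```

The closed-form solve is what makes kernel ridge regression convenient as a component: each fit reduces to one linear system in the kernel matrix, and the regularization parameter `lam` and the kernel lengthscale are the only tuning choices in this sketch.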