Instrumental variable (IV) regression relies on instruments to infer causal effects from observational data with unobserved confounding. We consider IV regression in time series models, such as vector auto-regressive (VAR) processes. Direct applications of i.i.d. techniques are generally inconsistent as they do not correctly adjust for dependencies in the past. In this paper, we outline the difficulties that arise due to time structure and propose methodology for constructing identifying equations that can be used for consistent parametric estimation of causal effects in time series data. One method uses extra nuisance covariates to obtain identifiability (an idea that can be of interest even in the i.i.d. case). We further propose a graph marginalization framework that allows us to apply nuisance IV and other IV methods in a principled way to time series. Our methods make use of a version of the global Markov property, which we prove holds for VAR(p) processes. For VAR(1) processes, we prove identifiability conditions that relate to Jordan forms and are different from the well-known rank conditions in the i.i.d. case (they do not require as many instruments as covariates, for example). We provide methods, prove their consistency, and show how the inferred causal effect can be used for distribution generalization. Simulation experiments corroborate our theoretical results. We provide ready-to-use Python code.
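The claim that i.i.d.-style IV estimators are generally inconsistent in time series can be illustrated with a small simulation. The sketch below is not the paper's released code. It assumes a hypothetical scalar VAR(1) system with an instrument process `I`, a hidden confounder `H`, a covariate `X`, and a response `Y`. All coefficients and the function names `simulate`, `naive_iv`, and `conditional_iv` are illustrative choices. The target is the effect `beta` of `X[t-1]` on `Y[t]`. The naive estimator uses `I[t-2]` as an instrument for `X[t-1]` exactly as in the i.i.d. case. The adjusted estimator first regresses `I[t-2]` on `I[t-3]` and uses the residual as the instrument, which is one simple way to adjust for dependence on the past in this particular model.

```python
import numpy as np


def simulate(n, beta=1.0, a=0.8, h=0.5, r=0.7, seed=0):
    """Simulate an assumed toy VAR(1) system (illustrative, not from the paper).

    I: autoregressive instrument, H: hidden autoregressive confounder,
    X: driven by lagged I and lagged H, Y: driven by lagged X, lagged H,
    and its own past. beta is the causal effect of X[t-1] on Y[t].
    """
    rng = np.random.default_rng(seed)
    eps = rng.standard_normal((n, 4))
    I, H, X, Y = (np.zeros(n) for _ in range(4))
    for t in range(1, n):
        I[t] = a * I[t - 1] + eps[t, 0]
        H[t] = h * H[t - 1] + eps[t, 1]
        X[t] = I[t - 1] + H[t - 1] + eps[t, 2]
        Y[t] = beta * X[t - 1] + H[t - 1] + r * Y[t - 1] + eps[t, 3]
    return I, X, Y


def cov(u, v):
    """Empirical covariance of two equally long 1-D arrays."""
    return np.mean((u - u.mean()) * (v - v.mean()))


def naive_iv(I, X, Y):
    """i.i.d.-style IV ratio: instrument I[t-2], covariate X[t-1], response Y[t]."""
    z, x, y = I[1:-2], X[2:-1], Y[3:]
    return cov(z, y) / cov(z, x)


def conditional_iv(I, X, Y):
    """Same ratio after partialling I[t-3] out of the instrument I[t-2]."""
    w, z, x, y = I[:-3], I[1:-2], X[2:-1], Y[3:]
    slope = cov(z, w) / cov(w, w)
    # Residual of the instrument given its own past.
    z_res = (z - z.mean()) - slope * (w - w.mean())
    return cov(z_res, y) / cov(z_res, x)


if __name__ == "__main__":
    I, X, Y = simulate(100_000)
    print("true beta:      1.0")
    print("naive IV:      ", naive_iv(I, X, Y))
    print("conditional IV:", conditional_iv(I, X, Y))
```

In this toy model the naive ratio is biased because `I[t-2]` is correlated with `I[t-3]`, which reaches `Y[t]` through `X[t-2]` and `Y[t-1]`. A short calculation gives the large-sample limit `beta + r * beta * a / (1 - r * a)` for the naive estimator under these assumptions, whereas the residualized instrument only reaches `Y[t]` through `X[t-1]`, so the adjusted estimate should settle near `beta`.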