Instrumental variable (IV) regression relies on instruments to infer causal effects from observational data with unobserved confounding. We consider IV regression in time series models, such as vector auto-regressive (VAR) processes. Direct applications of i.i.d. techniques are generally inconsistent as they do not correctly adjust for dependencies in the past. In this paper, we outline the difficulties that arise due to time structure and propose methodology for constructing identifying equations that can be used for consistent parametric estimation of causal effects in time series data. One method uses extra nuisance covariates to obtain identifiability (an idea that can be of interest even in the i.i.d. case). We further propose a graph marginalization framework that allows us to apply nuisance IV and other IV methods in a principled way to time series. Our methods make use of a version of the global Markov property, which we prove holds for VAR(p) processes. For VAR(1) processes, we prove identifiability conditions that relate to Jordan forms and are different from the well-known rank conditions in the i.i.d. case (they do not require as many instruments as covariates, for example). We provide methods, prove their consistency, and show how the inferred causal effect can be used for distribution generalization. Simulation experiments corroborate our theoretical results. We provide ready-to-use Python code.
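For orientation, the i.i.d. baseline that the abstract contrasts against can be sketched as follows. This is a minimal illustration of the standard moment-based IV estimator on simulated i.i.d. data, not the time series methodology proposed in the paper. The function name `iv_estimate`, the variable names, and the simulated structural equations are assumptions made for this example only.

```python
import numpy as np


def iv_estimate(instrument, covariate, response):
    """Standard i.i.d. IV estimate (hypothetical helper, not from the paper).

    Solves the empirical moment equation E[I (Y - X beta)] = 0 in the
    least-squares sense. Variables are assumed to have mean zero.
    """
    n = len(response)
    cov_ix = instrument.T @ covariate / n  # empirical E[I X^T], shape (m, d)
    cov_iy = instrument.T @ response / n   # empirical E[I Y], shape (m,)
    beta, *_ = np.linalg.lstsq(cov_ix, cov_iy, rcond=None)
    return beta


# Simulated i.i.d. data with a hidden confounder H (all choices are assumptions).
rng = np.random.default_rng(0)
n, beta_true = 20_000, 2.0
I = rng.normal(size=(n, 1))              # instrument, independent of H
H = rng.normal(size=(n, 1))              # unobserved confounder
X = I + H + rng.normal(size=(n, 1))      # covariate, affected by I and H
Y = (beta_true * X + H + rng.normal(size=(n, 1))).ravel()

# Ordinary least squares ignores the confounding through H.
beta_ols = np.linalg.lstsq(X, Y, rcond=None)[0]
# The IV estimate uses only the variation in X that is induced by I.
beta_iv = iv_estimate(I, X, Y)

print("OLS estimate:", beta_ols[0])
print("IV estimate: ", beta_iv[0])
```

In this i.i.d. simulation the IV estimate should land close to `beta_true`, while the OLS estimate is biased by the confounder. As the abstract notes, applying the same estimator directly to time series data is generally inconsistent, because the past of the process is not adjusted for.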