Prediction invariance of causal models under heterogeneous settings has been exploited by a number of recent methods for causal discovery, typically focussing on recovering the causal parents of a target variable of interest. When instrumental variables are not available, the causal Dantzig estimator exploits invariance under the more restrictive case of shift interventions. However, even in this case, one requires observational data from a number of sufficiently different environments, which are rarely available. In this paper, we consider a structural equation model where the target variable is described by a generalised additive model conditional on its parents. Beyond requiring finite moments, we make no modelling assumptions on the conditional distributions of the other variables in the system. In this setting, we characterise the causal model uniquely by means of two key properties: the Pearson residuals are invariant under the causal model, and, conditional on the causal parents, the causal parameters maximise the population likelihood. These two properties form the basis of a computational strategy for searching for the causal model among all possible models. Crucially, for generalised linear models with a known dispersion parameter, such as Poisson and logistic regression, the causal model can be identified from a single data environment.
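To make the single-environment claim concrete, the following is a minimal simulation sketch, not the paper's search procedure. It uses the standard Pearson residual, r = (y - mu) / sqrt(V(mu)), with V(mu) = mu and known dispersion 1 for Poisson regression. The toy system X1 -> Y -> X2, its coefficients, the helper `fit_poisson_irls`, and the rule "pick the subset whose Pearson dispersion is closest to 1" are all assumptions made for illustration. The sketch only exercises the Pearson-residual property and does not implement the likelihood condition.

```python
import itertools
import numpy as np


def fit_poisson_irls(X, y, n_iter=50, tol=1e-10):
    """Poisson regression (log link) by IRLS. X must contain an intercept column.

    Hypothetical helper written for this sketch; returns (beta, fitted means).
    """
    mu = y + 0.5                      # usual GLM starting values, keeps log() finite
    eta = np.log(mu)
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        z = eta + (y - mu) / mu       # working response for the log link
        XtW = X.T * mu                # working weights equal mu for Poisson/log
        beta_new = np.linalg.solve(XtW @ X, XtW @ z)
        eta = np.clip(X @ beta_new, -30, 30)   # guard against overflow in exp
        mu = np.exp(eta)
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break
    return beta, mu


# Assumed toy system, single environment: X1 -> Y -> X2.
rng = np.random.default_rng(0)
n = 20_000
x1 = rng.normal(size=n)                                 # causal parent of Y
y = rng.poisson(np.exp(0.3 + 0.5 * x1)).astype(float)   # Poisson target, dispersion 1
x2 = y + 0.3 * rng.normal(size=n)                       # descendant of Y
covariates = {"X1": x1, "X2": x2}

# Pearson dispersion estimate sum(r^2) / (n - p) for every candidate parent set.
results = {}
for k in range(len(covariates) + 1):
    for subset in itertools.combinations(covariates, k):
        X = np.column_stack([np.ones(n)] + [covariates[name] for name in subset])
        _, mu = fit_poisson_irls(X, y)
        results[subset] = np.sum((y - mu) ** 2 / mu) / (n - X.shape[1])

# Illustrative selection rule (an assumption): dispersion closest to the known value 1.
best = min(results, key=lambda s: abs(results[s] - 1.0))

for subset, disp in results.items():
    print(subset, round(disp, 3))
print("closest to known dispersion:", best)
```

In this toy setup one would expect the parent-only model `("X1",)` to give a dispersion estimate near 1, the empty model to look overdispersed because the parent is omitted, and models that include the descendant `X2` to deviate from 1 as well. How such a check generalises beyond this example is not something the sketch establishes.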