We present a new method for causal discovery in linear structural equation models. We propose a simple ``trick'' based on statistical testing in linear models that can distinguish between ancestors and non-ancestors of any given variable. Naturally, this can then be extended to estimating the causal order among all variables. We provide explicit error control for false causal discovery, at least asymptotically. This control holds even under Gaussianity, where other methods fail because the structure is non-identifiable. These type I error guarantees come at the cost of reduced empirical power. Additionally, we provide an asymptotically valid goodness-of-fit p-value to assess whether multivariate data stems from a linear structural equation model.