Learning causal relationships from empirical observations is a central task in scientific research. A common approach is to employ structural causal models, which postulate noisy functional relations among a set of interacting variables. To ensure that causal directions are uniquely identifiable, researchers consider restricted subclasses of structural causal models. Post-nonlinear (PNL) causal models are among the most flexible of these restricted subclasses and contain, in particular, the popular additive noise models as a further subclass. However, learning PNL models is not well studied beyond the bivariate case. Existing methods learn the non-linear functional relations by minimizing residual dependencies and subsequently test the residuals for independence to determine causal orientations. These methods can be prone to overfitting and are therefore difficult to tune appropriately in practice. As an alternative, we propose a new approach for PNL causal discovery that uses rank-based methods to estimate the functional parameters. This approach exploits natural invariances of PNL models and disentangles the estimation of the non-linear functions from the independence tests used to find causal orientations. We prove consistency of our method and validate our results in numerical experiments.
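To make the role of ranks concrete, the following is a minimal sketch, not the estimator proposed in the paper. It assumes the standard bivariate PNL form Y = g(f(X) + noise) with a strictly increasing outer function g, and checks that the ranks of Y coincide with the ranks of the undistorted signal f(X) + noise. This is presumably the kind of invariance that rank-based estimation can exploit. The functions `simulate_pnl` and `ranks`, the choices f(x) = x^3 and g(z) = exp(z / 4), and the noise distribution are all hypothetical and chosen only for illustration.

```python
import math
import random


def ranks(values):
    """Return the 0-based rank of each entry (assumes no ties)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order):
        result[index] = rank
    return result


def simulate_pnl(n, seed=0):
    """Draw n samples from a toy PNL model Y = g(f(X) + noise)."""
    rng = random.Random(seed)
    xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
    # Illustrative assumptions: f(x) = x^3 and uniform additive noise.
    zs = [x ** 3 + rng.uniform(-1.0, 1.0) for x in xs]
    # Illustrative assumption: g(z) = exp(z / 4), strictly increasing.
    ys = [math.exp(z / 4.0) for z in zs]
    return xs, zs, ys


xs, zs, ys = simulate_pnl(200)

# A strictly increasing g preserves order, so the ranks of the observed Y
# should agree with the ranks of the latent signal f(X) + noise.
ranks_match = ranks(zs) == ranks(ys)
print("ranks of Y equal ranks of f(X) + noise:", ranks_match)
```

Because the ranks of Y carry no information about g, a procedure that works only with ranks can, in principle, reason about the inner function without first fitting the outer distortion. How the paper's method actually uses this is not specified in the abstract.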