Linear non-Gaussian causal models postulate that each random variable is a linear function of parent variables and non-Gaussian exogenous error terms. We study identification of the linear coefficients when such models contain latent variables. Our focus is on the commonly studied acyclic setting, where each model corresponds to a directed acyclic graph (DAG). For this case, prior literature has demonstrated that connections to overcomplete independent component analysis yield effective criteria to decide parameter identifiability in latent variable models. However, this connection is based on the assumption that the observed variables linearly depend on the latent variables. Departing from this assumption, we treat models that allow for arbitrary non-linear latent confounding. Our main result is a graphical criterion that is necessary and sufficient for deciding the generic identifiability of direct causal effects. Moreover, we provide an algorithmic implementation of the criterion with a run time that is polynomial in the number of observed variables. Finally, we report on estimation heuristics based on the identification result, explore a generalization to models with feedback loops, and provide new results on the identifiability of the causal graph.
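To make the model class concrete, the following is a minimal simulation sketch, not the paper's method. It assumes two observed variables with a single direct effect `beta` from `x1` to `x2`, uniform (hence non-Gaussian) error terms, and one latent variable that enters both equations non-linearly. The function names `simulate` and `ols_slope`, the value of `beta`, and the choices `exp` and cubic for the confounding are all hypothetical and were picked only for illustration. The sketch shows why identification is non-trivial here: a naive regression of `x2` on `x1` does not recover `beta` under confounding.

```python
import numpy as np


def simulate(n=10_000, beta=0.8, seed=0):
    """Draw n samples from a toy two-variable model (illustrative assumption).

    x1 and x2 are linear in their observed parents and error terms, while the
    latent variable enters both equations through non-linear functions.
    """
    rng = np.random.default_rng(seed)
    latent = rng.uniform(-1.0, 1.0, size=n)  # unobserved confounder
    e1 = rng.uniform(-1.0, 1.0, size=n)      # non-Gaussian exogenous error
    e2 = rng.uniform(-1.0, 1.0, size=n)      # non-Gaussian exogenous error
    x1 = np.exp(latent) + e1                 # non-linear in the latent variable
    x2 = beta * x1 + latent ** 3 + e2        # linear in x1, non-linear in latent
    return x1, x2


def ols_slope(x, y):
    """Slope of the least-squares regression of y on x (with intercept)."""
    return np.cov(x, y)[0, 1] / np.var(x, ddof=1)


if __name__ == "__main__":
    beta = 0.8
    x1, x2 = simulate(beta=beta)
    print("true direct effect:", beta)
    print("naive OLS estimate:", ols_slope(x1, x2))
```

In this toy setup the regression slope is shifted away from `beta` because the latent variable drives both `x1` and `x2`. The non-Gaussianity of the errors is the extra information that identification results of the kind described above rely on.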