Linear non-Gaussian causal models postulate that each random variable is a linear function of parent variables and non-Gaussian exogenous error terms. We study identification of the linear coefficients when such models contain latent variables. Our focus is on the commonly studied acyclic setting, where each model corresponds to a directed acyclic graph (DAG). For this case, prior literature has demonstrated that connections to overcomplete independent component analysis yield effective criteria to decide parameter identifiability in latent variable models. However, this connection is based on the assumption that the observed variables linearly depend on the latent variables. Departing from this assumption, we treat models that allow for arbitrary non-linear latent confounding. Our main result is a graphical criterion that is necessary and sufficient for deciding the generic identifiability of direct causal effects. Moreover, we provide an algorithmic implementation of the criterion with a run time that is polynomial in the number of observed variables. Finally, we report on estimation heuristics based on the identification result and explore a generalization to models with feedback loops.
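To make the model class concrete, the following is one possible way to write the structural equations. The notation is an assumption for illustration and is not taken from the source: observed variables $X_1,\dots,X_p$, parent sets $\mathrm{pa}(j)$ in the DAG, direct causal effects $b_{ji}$, a latent vector $L$, loadings $c_{jk}$, unknown functions $g_j$, and non-Gaussian errors $\varepsilon_j$.

```latex
% Linear latent confounding (the setting tied to overcomplete ICA):
% the latent variables enter each observed variable linearly.
\[
  X_j \;=\; \sum_{i \in \mathrm{pa}(j)} b_{ji}\, X_i
        \;+\; \sum_{k} c_{jk}\, L_k \;+\; \varepsilon_j ,
  \qquad j = 1, \dots, p .
\]

% Arbitrary non-linear latent confounding (the setting studied here):
% the latent contribution g_j(L) is left unrestricted, while the
% dependence on observed parents remains linear.
\[
  X_j \;=\; \sum_{i \in \mathrm{pa}(j)} b_{ji}\, X_i
        \;+\; g_j(L) \;+\; \varepsilon_j ,
  \qquad j = 1, \dots, p .
\]
```

Under this reading, the identification question is whether the coefficients $b_{ji}$ can be recovered from the joint distribution of $(X_1,\dots,X_p)$ alone, for generic parameter choices.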