We consider linear structural equation models with explicitly modelled latent variables. In such models, observed and latent variables satisfy linear equations that include stochastic noise terms. The goal of our work is to identify the direct causal effects between the observed variables of interest by providing (rational) formulas in the observed covariances. Most prior identification approaches operate in the latent projection framework, where latent variables are projected away into dependent error terms. However, when the observed variables are densely confounded, even if only by a few latent variables, projection-based approaches are unable to certify identifiability of most effects. For such problems, approaches that explicitly use the latent variables are more effective, but recently proposed algorithms of this kind often remain inconclusive for denser causal graphs. We develop a new identification criterion that handles dense graphs better. Its key insight is that recursive identification schemes can be generalized by explicitly accounting for causal parents whose direct effects are not (yet) identified. The combinatorial search problems arising in our new criterion can be tackled with network-flow computations, leading to a practically useful algorithmic tool that we also make available in software.
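To make the setting concrete, the following minimal sketch builds a hypothetical four-variable linear model with one explicit latent confounder and recovers a direct effect as a rational formula in observed covariances. The model, the coefficient values, and the helper names `combine` and `cov` are illustrative assumptions, not taken from the paper, and the formula used is the classical instrumental-variable ratio rather than the new criterion described above.

```python
from fractions import Fraction as F

# Hypothetical toy model (an assumption for illustration, not from the paper).
# Each variable is stored as its expansion in independent noise terms:
# {noise_name: coefficient}.

def combine(terms, own_noise):
    """Build a variable as sum(coef * parent) plus its own noise term."""
    out = {own_noise: F(1)}
    for coef, expansion in terms:
        for name, c in expansion.items():
            out[name] = out.get(name, F(0)) + coef * c
    return out

def cov(u, v, noise_var):
    """Exact covariance of two variables from their noise expansions."""
    return sum(c * v[name] * noise_var[name] for name, c in u.items() if name in v)

# Structural coefficients; b is the direct effect X -> Y we want to identify.
a, b, c, d = F(2), F(3), F(1), F(-1)
noise_var = {"eL": F(1), "eZ": F(1), "eX": F(1), "eY": F(1)}

L = combine([], "eL")                # latent confounder (unobserved)
Z = combine([], "eZ")                # observed instrument
X = combine([(a, Z), (c, L)], "eX")  # X = a*Z + c*L + eX
Y = combine([(b, X), (d, L)], "eY")  # Y = b*X + d*L + eY

# Rational formula in observed covariances (classical instrumental variable).
iv_estimate = cov(Z, Y, noise_var) / cov(Z, X, noise_var)
# Naive regression coefficient, which is confounded by L.
ols_estimate = cov(X, Y, noise_var) / cov(X, X, noise_var)

print(iv_estimate, ols_estimate)
```

In this toy model the ratio of covariances recovers `b` exactly, while the regression coefficient of Y on X does not, because the latent variable L affects both. Exact rational arithmetic is used so that the comparison does not depend on floating-point tolerance.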