This paper considers the challenging problem of identifying a causal graphical model in the presence of latent variables. While various identifiability conditions have been proposed in the literature, they often require multiple pure children per latent variable or impose restrictions on the latent causal graph. Furthermore, it is commonly assumed that all observed variables share the same modality. Consequently, the existing identifiability conditions are often too stringent for complex real-world data. We consider a general nonparametric measurement model with arbitrary observed variable types and binary latent variables, and propose a double triangular graphical condition that guarantees identifiability of the entire causal graphical model. The proposed condition significantly relaxes the popular pure children condition. We also establish necessary conditions for identifiability, which provide insight into the fundamental limits of identifiability. Simulation studies verify that latent structures satisfying our conditions can be accurately estimated from data.