In causal discovery with latent variables, we define two data paradigms: definite data, a single-skeleton structure whose observed nodes are single-valued, and indefinite data, a set of multi-skeleton structures whose observed nodes are multi-valued. Multiple skeletons lead to low sample utilization, and multi-valued nodes make distributional assumptions untenable; as a result, recovering causal relations from indefinite data remains largely unexplored. We design a causal strength variational model to address these two problems. Specifically, we use the causal strength, rather than independent noise, as the latent variable that mediates the evidence lower bound. Under this design, the causal strengths of different skeletons are treated as a distribution and can be expressed as a single-valued causal graph matrix. Moreover, to account for latent confounders, we disentangle the causal graph G into two relation subgraphs, O and C: O contains the pure relations between observed nodes, while C represents the relations from latent variables to observed nodes. We summarize these designs as Confounding Disentanglement Causal Discovery (biCD), which is tailored to learn causal representations from indefinite data under latent confounding. Finally, we conduct comprehensive experiments on synthetic and real-world data to demonstrate the effectiveness of our method.
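As a rough illustration of the disentanglement of G into O and C described in the abstract, the sketch below splits a toy causal strength matrix into an observed-to-observed block and a latent-to-observed block. It is not the biCD model itself: the function name `split_causal_graph`, the convention that `G[i][j]` is the strength of the edge from node i to node j, and the ordering of observed nodes before latent nodes are all assumptions made for the example.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def split_causal_graph(G: Matrix, n_obs: int) -> Tuple[Matrix, Matrix]:
    """Split a square strength matrix into subgraphs O and C.

    Assumed convention (hypothetical, not taken from the source):
    G[i][j] is the strength of the edge i -> j, the first n_obs
    indices are observed nodes, and the remaining ones are latent.
    """
    n = len(G)
    if any(len(row) != n for row in G):
        raise ValueError("G must be a square matrix")
    if not 0 <= n_obs <= n:
        raise ValueError("n_obs must be between 0 and the number of nodes")
    # O: edges whose source and target are both observed nodes.
    O = [row[:n_obs] for row in G[:n_obs]]
    # C: edges from latent nodes (rows) into observed nodes (columns).
    C = [row[:n_obs] for row in G[n_obs:]]
    return O, C


# Toy graph: two observed nodes (0, 1) and one latent confounder (2).
G = [
    [0.0, 0.8, 0.0],  # observed 0 -> observed 1
    [0.0, 0.0, 0.0],
    [0.5, 0.3, 0.0],  # latent 2 -> observed 0 and observed 1
]
O, C = split_causal_graph(G, n_obs=2)
```

In this toy case O keeps only the edge between the two observed nodes, while C holds the two edges by which the latent node confounds them.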