Causal discovery from observational data is a challenging, and often impossible, task. However, the causal structure can be estimated under certain assumptions on the data-generating process. Many commonly used methods rely on the additivity of noise in structural equation models. Additivity implies that the variance and the tail of the effect, given the causes, are invariant; the cause therefore affects only the mean. However, the tail or other characteristics of the random variable can carry additional information about the causal structure. Such cases have received very little attention in the literature thus far. Previous studies have shown that the causal graph is identifiable under various models, such as linear non-Gaussian, post-nonlinear, or quadratic variance functional models. In this study, we introduce a new class of models called conditional parametric causal models (CPCMs), in which the cause affects different characteristics of the effect. We use sufficient statistics to establish the identifiability of CPCMs in the exponential family of conditional distributions. Moreover, we propose an algorithm for estimating the causal structure from a random sample from a CPCM. The empirical properties of the methodology are studied on various datasets, including an application to the expenditure behavior of residents of the Philippines.
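
To make the contrast between additive noise and a cause that affects other characteristics concrete, the following is a minimal simulation sketch. It is not the estimation algorithm proposed in this study. The functional forms (`x ** 2` for the mean, `0.5 + abs(x)` for the standard deviation), the Gaussian noise, and the helper names `simulate` and `residual_variance_ratio` are all assumptions chosen purely for illustration. The sketch compares the residual variance of the effect for causes near zero and far from zero.

```python
import random
import statistics


def simulate(n, seed=0):
    """Draw (x, y_additive, y_scale) triples.

    y_additive follows an additive noise model: Y = X**2 + noise.
    y_scale follows a hypothetical model in which X changes only the
    spread of Y: Y | X ~ Normal(0, (0.5 + |X|)**2).
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        y_additive = x ** 2 + rng.gauss(0.0, 1.0)  # cause shifts the mean only
        y_scale = rng.gauss(0.0, 0.5 + abs(x))     # cause shifts the scale only
        rows.append((x, y_additive, y_scale))
    return rows


def residual_variance_ratio(rows, column, mean_fn):
    """Variance of Y - E[Y | X] for |X| > 1 divided by the same for |X| <= 1."""
    near = [r[column] - mean_fn(r[0]) for r in rows if abs(r[0]) <= 1.0]
    far = [r[column] - mean_fn(r[0]) for r in rows if abs(r[0]) > 1.0]
    return statistics.variance(far) / statistics.variance(near)


rows = simulate(20000)
# The true conditional means are known here because we chose the models.
ratio_additive = residual_variance_ratio(rows, 1, lambda x: x ** 2)
ratio_scale = residual_variance_ratio(rows, 2, lambda x: 0.0)
print(f"additive noise, variance ratio: {ratio_additive:.2f}")
print(f"cause affects scale, variance ratio: {ratio_scale:.2f}")
```

Under the additive model the ratio should be close to one, since the conditional variance does not depend on the cause. In the second model the ratio is well above one, which is the kind of information that mean-only methods ignore.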