We consider a collection of categorical random variables. Of special interest is the causal effect on an outcome variable of an intervention on another variable. Conditionally on a Directed Acyclic Graph (DAG), we assume that the joint law of the random variables factorizes according to the DAG, where each term is a categorical distribution for a node variable given a configuration of its parents. The graph is equipped with a causal interpretation through the notion of interventional distribution and the allied "do-calculus". From a modeling perspective, the likelihood decomposes into a product of DAG-parameters over nodes and parent configurations, and a suitably specified collection of Dirichlet priors is assigned to these parameters. The overall joint distribution on the ensemble of DAG-parameters is then constructed using global and local independence. We account for DAG-model uncertainty and propose a reversible jump Markov Chain Monte Carlo (MCMC) algorithm that targets the joint posterior over DAGs and DAG-parameters; from its output we recover a full posterior distribution of any causal effect coefficient of interest, possibly summarized by a Bayesian Model Averaging (BMA) point estimate. We validate our method through extensive simulation studies, in which it outperforms alternative state-of-the-art procedures in terms of estimation accuracy. Finally, we analyze a dataset from a study on depression and anxiety in undergraduate students.
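To make the parameter-level part of this construction concrete, the following is a minimal sketch of posterior inference on a causal effect for one fixed DAG. It is not the reversible jump MCMC algorithm described above. It assumes a hypothetical three-node binary DAG (Z -> X, Z -> Y, X -> Y), a common Dirichlet hyperparameter `a` for every conditional distribution, and the helper names `simulate` and `posterior_effect_draws`, none of which come from the paper. Under global and local independence each conditional distribution has its own Dirichlet posterior, and the interventional distribution follows from the truncated factorization, here P(Y=1 | do(X=x)) = sum_z P(z) P(Y=1 | x, z).

```python
import numpy as np


def simulate(n, rng):
    """Draw n samples from a hypothetical binary DAG: Z -> X, Z -> Y, X -> Y."""
    z = rng.binomial(1, 0.5, size=n)
    x = rng.binomial(1, np.where(z == 1, 0.7, 0.3))
    # By construction the true effect of do(X=1) versus do(X=0) on P(Y=1) is 0.3.
    y = rng.binomial(1, 0.2 + 0.3 * x + 0.3 * z)
    return z, x, y


def posterior_effect_draws(z, x, y, a=1.0, n_draws=2000, rng=None):
    """Posterior draws of P(Y=1 | do(X=1)) - P(Y=1 | do(X=0)) for the fixed DAG.

    Each conditional distribution receives an independent Dirichlet(a, a) prior
    (an assumed hyperparameter choice), so its posterior is Dirichlet(a + counts).
    """
    rng = np.random.default_rng() if rng is None else rng

    # Sufficient statistics: counts of Z, and counts of Y indexed by [x, z, y].
    n_z = np.bincount(z, minlength=2)
    n_xzy = np.zeros((2, 2, 2))
    np.add.at(n_xzy, (x, z, y), 1)

    # Posterior draws of the marginal distribution of Z.
    theta_z = rng.dirichlet(a + n_z, size=n_draws)

    effects = np.zeros(n_draws)
    for zv in (0, 1):
        # Independent posterior draws of P(Y=1 | X=1, Z=zv) and P(Y=1 | X=0, Z=zv).
        p_y_x1 = rng.dirichlet(a + n_xzy[1, zv], size=n_draws)[:, 1]
        p_y_x0 = rng.dirichlet(a + n_xzy[0, zv], size=n_draws)[:, 1]
        # Truncated factorization: weight each stratum by P(Z=zv).
        effects += theta_z[:, zv] * (p_y_x1 - p_y_x0)
    return effects


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    z, x, y = simulate(20000, rng)
    draws = posterior_effect_draws(z, x, y, rng=rng)
    print("posterior mean:", draws.mean())
    print("95% credible interval:", np.quantile(draws, [0.025, 0.975]))
```

In the setting of the abstract the DAG is itself unknown, so draws of this kind would be produced for each DAG visited by the sampler and pooled, with their overall mean playing the role of a BMA point estimate.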