We consider modeling a binary response variable together with a set of covariates for two groups using observational data. The grouping variable can be a confounding variable (a common cause of treatment and outcome), gender, case/control status, ethnicity, etc. Given the covariates and a binary latent variable, the goal is to construct two directed acyclic graphs (DAGs), one per group, that share some common parameters. The set of nodes, which represent the variables, is the same for both groups, but the directed edges between nodes, which represent the causal relationships between the variables, can differ. For each group, we also estimate the effect size for each node. We assume that each group follows a Gaussian distribution under its DAG. By the Markov property of DAGs, each node is conditionally independent of its non-descendants given its parent nodes, so the joint distribution factorizes into node-wise conditional distributions. We introduce the Gaussian DAG-probit model for two groups, which we call the doubly Gaussian DAG-probit model. To estimate the skeleton of the DAGs and the model parameters, we draw samples from the posterior distribution of the doubly Gaussian DAG-probit model via an MCMC method. We validate the proposed method in a comprehensive simulation experiment and apply it to two real datasets. Furthermore, we validate the results of the real data analysis against well-known experimental studies to show the value of the proposed grouping variable in the causality domain.
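To make the two-group setup concrete, the following is a minimal simulation sketch, not the authors' implementation. It assumes a linear Gaussian structural equation model with unit noise variances, places a continuous latent node last in the topological order, and obtains the binary response by thresholding that node at zero. The function name `simulate_group`, the matrices `B0` and `B1`, and all edge weights are hypothetical and chosen only for illustration. Both groups share the same four nodes and one common edge, while the remaining edges differ.

```python
import numpy as np


def simulate_group(B, n, rng, threshold=0.0):
    """Sample n rows from a linear Gaussian DAG and threshold the last node.

    B[k, j] != 0 encodes an edge k -> j with that effect size. B must be
    strictly upper triangular, so nodes 0..q-1 are in topological order.
    The last node is treated as a latent variable (an assumption of this
    sketch); the observed binary response is 1 when it exceeds `threshold`.
    """
    q = B.shape[0]
    if not np.allclose(B, np.triu(B, k=1)):
        raise ValueError("B must be strictly upper triangular")
    X = np.zeros((n, q))
    for j in range(q):
        # Each node depends only on its parents plus independent N(0, 1)
        # noise, which is the Markov factorization of the joint distribution.
        X[:, j] = X @ B[:, j] + rng.standard_normal(n)
    latent = X[:, -1]
    y = (latent > threshold).astype(int)
    return X[:, :-1], y  # observed covariates and binary response


rng = np.random.default_rng(0)

# Group 0: edges 0 -> 1 and 1 -> 3 (node 3 is the latent response node).
B0 = np.zeros((4, 4))
B0[0, 1] = 0.8
B0[1, 3] = 1.0

# Group 1: same nodes, shared edge 1 -> 3, but different covariate edges.
B1 = np.zeros((4, 4))
B1[0, 2] = -0.5
B1[2, 3] = 1.2
B1[1, 3] = 1.0  # parameter shared with group 0

X0, y0 = simulate_group(B0, 2000, rng)
X1, y1 = simulate_group(B1, 2000, rng)

# The edge 0 -> 1 exists only in group 0, so the sample correlation between
# covariates 0 and 1 should be clearly nonzero there and near zero in group 1.
corr0 = np.corrcoef(X0[:, 0], X0[:, 1])[0, 1]
corr1 = np.corrcoef(X1[:, 0], X1[:, 1])[0, 1]
```

Posterior inference over the graphs and parameters (the MCMC step described in the abstract) is deliberately left out of this sketch; it only illustrates the data-generating assumptions, namely a shared node set, group-specific edges, and a thresholded Gaussian latent response.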