Several epidemiological studies have provided evidence that long-term exposure to fine particulate matter (PM2.5) increases mortality risk. Population characteristics such as age, race, and socioeconomic status may also play a crucial role in determining vulnerability to air pollution. To inform policy, it is necessary to identify the population groups that are more or less vulnerable to air pollution. In the causal inference literature, the Group Average Treatment Effect (GATE) is a particular form of the conditional average treatment effect that is widely used to characterize how a treatment effect varies with population characteristics. In this work, we introduce a novel Confounder-Dependent Bayesian Mixture Model (CDBMM) to characterize causal effect heterogeneity. More specifically, our method leverages the flexibility of the dependent Dirichlet process to model the distribution of the potential outcomes conditional on the covariates and the treatment levels, thus enabling us to: (i) identify, in a data-driven way, heterogeneous and mutually exclusive population groups defined by similar GATEs, and (ii) estimate and characterize the causal effects within each of the identified groups. Through simulations, we demonstrate the effectiveness of our method in uncovering key insights about treatment effect heterogeneity. We apply our method to claims data from Medicare enrollees in Texas and identify six mutually exclusive groups across which the causal effect of PM2.5 on mortality is heterogeneous.
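For readers less familiar with the estimands named above, the relationship between the conditional average treatment effect and the GATE can be sketched as follows. The notation is an assumption made for illustration and does not come from the abstract: a binary treatment with potential outcomes Y(1) and Y(0), covariates X, and a group label G that is a function of X.

```latex
% Illustrative notation only (assumed, not taken from the abstract):
% Y(1), Y(0): potential outcomes under treatment and control
% X: covariates; G = g(X): group membership determined by the covariates
\begin{align*}
\tau(x) &= \mathbb{E}\left[\, Y(1) - Y(0) \mid X = x \,\right]
  && \text{(conditional average treatment effect)} \\
\tau_g  &= \mathbb{E}\left[\, Y(1) - Y(0) \mid G = g \,\right]
         = \mathbb{E}\left[\, \tau(X) \mid G = g \,\right]
  && \text{(group average treatment effect)}
\end{align*}
```

Under this notation the GATE averages the conditional effect over the covariate distribution within a group, so groups with similar GATEs contain units whose effects are similar on average.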