Several epidemiological studies have provided evidence that long-term exposure to fine particulate matter (PM2.5) increases mortality risk. Furthermore, some population characteristics (e.g., age, race, and socioeconomic status) might play a crucial role in understanding vulnerability to air pollution. To inform policy, it is necessary to identify mutually exclusive groups of the population that are more or less vulnerable to air pollution. In the causal inference literature, the Conditional Average Treatment Effect (CATE) is a commonly used metric that characterizes how a treatment effect varies with population characteristics. In this work, we introduce a novel modeling approach, called the Confounder-Dependent Bayesian Mixture Model (CDBMM), to characterize causal effect heterogeneity. More specifically, our method leverages the flexibility of the Dependent Dirichlet Process to model the distribution of the potential outcomes conditional on the covariates, thus enabling us to: (i) estimate individual treatment effects, (ii) identify heterogeneous and mutually exclusive population groups defined by similar CATEs, and (iii) estimate causal effects within each of the identified groups. Through simulations, we demonstrate the effectiveness of our method in uncovering key insights about treatment effect heterogeneity. We apply our method to claims data from Medicare enrollees in Texas. We found seven mutually exclusive groups across which the causal effect of PM2.5 on mortality differs.
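For readers unfamiliar with the estimands named above, the following is a sketch of their standard textbook definitions. The notation is an assumption introduced here for illustration and is not taken from the paper: a binary treatment with potential outcomes Y(0) and Y(1), covariates X, and a group label G that assigns each unit to exactly one of K mutually exclusive groups.

```latex
% Assumed notation (illustrative only): binary treatment, potential outcomes
% Y_i(0), Y_i(1), covariates X_i, and group label G_i \in \{1, \dots, K\}.
\begin{align*}
  \tau_i      &= Y_i(1) - Y_i(0)
              && \text{(individual treatment effect)} \\
  \tau(x)     &= \mathbb{E}\left[\, Y(1) - Y(0) \mid X = x \,\right]
              && \text{(CATE)} \\
  \tau_g      &= \mathbb{E}\left[\, Y(1) - Y(0) \mid G = g \,\right],
              \quad g = 1, \dots, K
              && \text{(group-level average effect)}
\end{align*}
```

Under these definitions, items (i) to (iii) correspond to estimating the individual effects, partitioning units into groups with similar CATE values, and estimating the average effect within each group, respectively.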