The causal inference literature has increasingly recognized that explicitly targeting treatment effect heterogeneity can lead to improved scientific understanding and policy recommendations. Toward the same ends, studying the causal pathway connecting the treatment to the outcome can also be useful. This paper addresses both problems in the context of \emph{causal mediation analysis}. We introduce a varying coefficient model based on Bayesian additive regression trees to identify and regularize heterogeneous causal mediation effects; analogously to linear structural equation models (LSEMs), these effects correspond to covariate-dependent products of coefficients. We show that, even on large datasets with few covariates, LSEMs can produce highly unstable estimates of the conditional average direct and indirect effects, while our \emph{Bayesian causal mediation forests} model produces stable estimates. We find that our approach is conservative, with effect estimates ``shrunk towards homogeneity.'' We examine the salient properties of our method using both data from the Medical Expenditure Panel Survey and empirically grounded simulated data. Finally, we show how our model can be combined with posterior summarization strategies to identify interesting subgroups and interpret the model fit.
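To make the phrase ``covariate-dependent products of coefficients'' concrete, the following is a minimal sketch under assumed notation; the symbols $A$ (binary treatment), $M$ (mediator), $Y$ (outcome), $x$ (covariates), and the coefficient names are illustrative choices and are not taken from the text above. A standard LSEM without treatment-mediator interaction can be written as
\[
M = \alpha_0 + \alpha_1 A + x^\top \alpha_x + \varepsilon, \qquad
Y = \beta_0 + \beta_1 A + \beta_2 M + x^\top \beta_x + \nu ,
\]
so that, under the usual sequential ignorability assumptions, the direct effect is $\beta_1$ and the indirect effect is the product $\alpha_1 \beta_2$. A varying coefficient analogue lets the intercepts and slopes depend on $x$:
\[
M = \alpha_0(x) + \alpha_1(x)\, A + \varepsilon, \qquad
Y = \beta_0(x) + \beta_1(x)\, A + \beta_2(x)\, M + \nu ,
\]
which, under the same assumptions, gives a conditional average direct effect of $\beta_1(x)$ and a conditional average indirect effect of $\alpha_1(x)\,\beta_2(x)$. In this sketch, an effect that is homogeneous in $x$ corresponds to these functions being constant, which is the sense in which shrinkage ``towards homogeneity'' can be read.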