We address modelling and computational issues for inference on multiple treatment effects in the presence of many potential confounders. A primary concern is to prevent the harmful effects of omitting relevant covariates (under-selection) without incurring over-selection, which introduces substantial variance and a bias related to the non-random over-inclusion of covariates. We propose a novel empirical Bayes framework for Bayesian model averaging that learns from data the extent to which the inclusion of key covariates should be encouraged, specifically those highly associated with the treatments. A key challenge is computational. We develop fast algorithms, including an Expectation-Propagation variational approximation and simple stochastic gradient optimization algorithms, to learn the hyper-parameters from data. Our framework relies on widely used ingredients and largely on existing software, and it is implemented in the R package mombf, available on CRAN. This work is motivated by, and illustrated in, two applications. The first is the association between salary variation and discriminatory factors. The second, which has been debated in previous work, is the association between abortion policies and crime. Our approach provides insights that differ from previous analyses, especially in situations with weaker treatment effects.