We address modelling and computational issues for multiple treatment effect inference under many potential confounders. A primary issue is preventing the harmful effects of omitting relevant covariates (under-selection) without running into over-selection, which introduces substantial variance and a bias related to the non-random over-inclusion of covariates. We propose a novel empirical Bayes framework for Bayesian model averaging that learns from data the extent to which the inclusion of key covariates should be encouraged, specifically those highly associated with the treatments. A key challenge is computational. We develop fast algorithms, including an Expectation-Propagation variational approximation and simple stochastic gradient optimization algorithms, to learn the hyper-parameters from data. Our framework relies on widely used ingredients and largely on existing software, and it is implemented in the R package mombf, available on CRAN. This work is motivated by, and illustrated in, two applications. The first is the association between salary variation and discriminatory factors. The second, which has been debated in previous work, is the association between abortion policies and crime. Our approach provides insights that differ from previous analyses, especially in situations with weaker treatment effects.