We address modelling and computational issues for multiple treatment effect inference under many potential confounders. Our main contribution is to strike a balance between preventing the omission of relevant confounders and avoiding an over-selection of instruments that significantly inflates variance. We propose a novel empirical Bayes framework for Bayesian model averaging that learns from the data the extent to which the inclusion of key covariates should be encouraged. Our framework sets a prior that asymptotically matches the true amount of confounding in the data, as measured by a novel confounding coefficient. A key challenge is computational. We develop fast algorithms that use an exact gradient of the marginal likelihood, with cost linear in the number of covariates, together with a variational counterpart. Our framework relies on widely used ingredients and largely existing software, and it is implemented in the R package mombf. We illustrate our work with two applications. The first is the association between salary variation and discriminatory factors. The second, which has been debated in previous work, is the association between abortion policies and crime. Our approach provides insights that differ from previous analyses, especially in situations with weaker treatment effects.