We address modelling and computational issues for multiple treatment effect inference under many potential confounders. Our main contribution is to strike a balance between preventing the omission of relevant confounders and avoiding an over-selection of instruments that significantly inflates variance. We propose a novel empirical Bayes framework for Bayesian model averaging that learns the prior inclusion probabilities of key covariates from the data. Our framework sets a data-dependent prior that asymptotically matches the true amount of confounding in the data, as measured by a novel confounding coefficient. A key challenge is computational: we develop fast algorithms based on an exact gradient of the marginal likelihood, whose cost is linear in the number of covariates, and on a variational counterpart. Our framework relies on widely used ingredients and largely on existing software, and it is implemented within the R package mombf. We illustrate our work with two applications. The first is the association between salary variation and discriminatory factors. The second, which has been debated in previous work, is the association between abortion policies and crime. Our approach provides insights that differ from previous analyses, especially in situations with weaker treatment effects.