Our goal is to produce methods for observational causal inference that are auditable, easy to troubleshoot, accurate for treatment effect estimation, and scalable to high-dimensional data. We describe an almost-exact matching approach that achieves these goals by (i) learning a distance metric via outcome modeling, (ii) creating matched groups using the distance metric, and (iii) using the matched groups to estimate treatment effects. Because the distance metric is constructed from variable importance measurements, the method is flexible and can be adapted to various applications. Focusing on scalability in the number of potential confounders, we operationalize our approach with LASSO. We derive performance guarantees for settings where LASSO outcome modeling consistently identifies all confounders (importantly, without requiring the linear model to be correctly specified). We also provide experimental results demonstrating the auditability of matches, as well as extensions to more general nonparametric outcome modeling.
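The three steps (i) to (iii) can be illustrated with a minimal sketch. It is not the authors' implementation, and several choices are assumptions made only for illustration: a small coordinate-descent LASSO is fit separately in each treatment arm, the variable importance of a covariate is taken to be the mean absolute LASSO coefficient across arms, the distance is a weighted L1 distance, matched groups are the `k` nearest units in each arm, and the data are synthetic. All function names and parameters (`lam`, `k`, `learn_weights`, and so on) are hypothetical. For brevity the same sample is used to learn the metric and to match, whereas a careful analysis would likely use separate splits.

```python
import random

def soft_threshold(z, lam):
    """Soft-thresholding operator used by the LASSO coordinate update."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0

def lasso_coefficients(X, y, lam, n_iter=100):
    """Coordinate descent for (1/2n)||y - Xb||^2 + lam * ||b||_1 on centered data."""
    n, p = len(X), len(X[0])
    col_mean = [sum(row[j] for row in X) / n for j in range(p)]
    Xc = [[row[j] - col_mean[j] for j in range(p)] for row in X]
    y_mean = sum(y) / n
    resid = [v - y_mean for v in y]  # residual when all coefficients are zero
    col_sq = [sum(row[j] ** 2 for row in Xc) / n for j in range(p)]
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Covariance of column j with the partial residual (column j's fit added back).
            rho = sum(Xc[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam) / col_sq[j]
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= Xc[i][j] * delta
                beta[j] = new
    return beta

def learn_weights(X, y, t, lam=0.05):
    """Step (i): importance of each covariate = mean |LASSO coefficient| over both arms."""
    p = len(X[0])
    weights = [0.0] * p
    for arm in (0, 1):
        idx = [i for i in range(len(t)) if t[i] == arm]
        beta = lasso_coefficients([X[i] for i in idx], [y[i] for i in idx], lam)
        for j in range(p):
            weights[j] += abs(beta[j]) / 2
    return weights

def matched_group(x, X, t, weights, arm, k):
    """Step (ii): indices of the k units in `arm` nearest to x in weighted L1 distance."""
    def dist(i):
        return sum(w * abs(a - b) for w, a, b in zip(weights, x, X[i]))
    candidates = [i for i in range(len(t)) if t[i] == arm]
    return sorted(candidates, key=dist)[:k]

def estimate_effects(X, y, t, weights, k=5):
    """Step (iii): per-unit effect = mean treated outcome minus mean control outcome."""
    effects = []
    for x in X:
        treated = matched_group(x, X, t, weights, 1, k)
        control = matched_group(x, X, t, weights, 0, k)
        effects.append(sum(y[i] for i in treated) / k - sum(y[i] for i in control) / k)
    return effects

# Synthetic data (hypothetical): x0 and x1 drive the outcome, x0 also drives treatment,
# and x2..x4 are irrelevant covariates that the metric should ignore.
random.seed(0)
n, p, true_effect = 400, 5, 3.0
X = [[random.uniform(-1, 1) for _ in range(p)] for _ in range(n)]
t = [1 if row[0] + random.gauss(0, 1) > 0 else 0 for row in X]
y = [2 * row[0] + row[1] + true_effect * ti + random.gauss(0, 0.1) for row, ti in zip(X, t)]

weights = learn_weights(X, y, t)
effects = estimate_effects(X, y, t, weights)
ate = sum(effects) / n
print("weights:", [round(w, 2) for w in weights])
print("matched ATE estimate:", round(ate, 2))
```

Since each estimate is an average over an explicit list of matched units, a call such as `matched_group(X[0], X, t, weights, 1, 5)` returns the indices behind a given estimate, which is the sense in which matches of this kind can be audited.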