Our goal is to produce methods for observational causal inference that are auditable, easy to troubleshoot, accurate for treatment effect estimation, and scalable to high-dimensional data. We describe a general framework called Model-to-Match that achieves these goals by (i) learning a distance metric via outcome modeling, (ii) creating matched groups using the distance metric, and (iii) using the matched groups to estimate treatment effects. Model-to-Match uses variable importance measurements to construct a distance metric, making it a flexible framework that can be adapted to various applications. Focusing on scalability in the number of potential confounders, we operationalize the Model-to-Match framework with LASSO. We derive performance guarantees for settings where LASSO outcome modeling consistently identifies all confounders (importantly, without requiring the linear model to be correctly specified). We also provide experimental results demonstrating the method's auditability, accuracy, and scalability, as well as extensions to more general nonparametric outcome modeling.
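To make steps (i) through (iii) concrete, the following is a minimal sketch of one possible LASSO-based instantiation on synthetic data. It is an illustration under stated assumptions, not the authors' implementation: the helper names (`lasso_cd`, `weighted_dist`, `matched_ate`), the per-arm LASSO fits, the use of summed absolute coefficients as metric weights, the train/estimation split, the penalty `lam=0.1`, the choice of `k=5` neighbors, and the data-generating process are all hypothetical choices made for this example.

```python
import random

def soft_threshold(z, lam):
    """Soft-thresholding operator used in the LASSO coordinate update."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0

def lasso_cd(X, y, lam, n_sweeps=100):
    """Coordinate descent for (1/2n)*||y - b0 - X b||^2 + lam*||b||_1."""
    n, p = len(X), len(X[0])
    b0 = sum(y) / n
    beta = [0.0] * p
    resid = [yi - b0 for yi in y]
    col_sq = [sum(row[j] ** 2 for row in X) / n for j in range(p)]
    for _ in range(n_sweeps):
        for j in range(p):
            # Covariance of feature j with the partial residual (feature j removed).
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new_bj = soft_threshold(rho, lam) / col_sq[j]
            delta = new_bj - beta[j]
            if delta != 0.0:
                resid = [resid[i] - delta * X[i][j] for i in range(n)]
                beta[j] = new_bj
        shift = sum(resid) / n  # re-fit the unpenalized intercept
        b0 += shift
        resid = [r - shift for r in resid]
    return b0, beta

def weighted_dist(a, b, w):
    """Weighted L1 distance; covariates with zero weight are ignored."""
    return sum(wj * abs(aj - bj) for wj, aj, bj in zip(w, a, b))

def matched_ate(X, t, y, w, k=5):
    """Step (ii)+(iii): k-NN matched groups per arm, then average the per-unit effects."""
    cates = []
    for xi in X:
        arm_means = []
        for arm in (0, 1):
            pool = sorted(
                (weighted_dist(xi, X[j], w), y[j]) for j in range(len(X)) if t[j] == arm
            )
            nearest = pool[:k]
            arm_means.append(sum(yj for _, yj in nearest) / len(nearest))
        cates.append(arm_means[1] - arm_means[0])
    return sum(cates) / len(cates)

# Synthetic data (assumed for illustration): only x[0] and x[1] affect the outcome,
# and x[0] also drives treatment assignment, so it is a confounder.
rng = random.Random(0)
n, p, true_ate = 1000, 8, 1.0
X = [[rng.uniform(-1, 1) for _ in range(p)] for _ in range(n)]
t = [1 if x[0] + rng.gauss(0, 1) > 0 else 0 for x in X]
y = [2 * x[0] + x[1] + true_ate * ti + rng.gauss(0, 0.1) for x, ti in zip(X, t)]

# Step (i): learn the metric on a training split, one LASSO fit per treatment arm.
half = n // 2
weights = [0.0] * p
for arm in (0, 1):
    idx = [i for i in range(half) if t[i] == arm]
    _, beta = lasso_cd([X[i] for i in idx], [y[i] for i in idx], lam=0.1)
    weights = [wj + abs(bj) for wj, bj in zip(weights, beta)]

# Steps (ii) and (iii): match and estimate on the held-out estimation split.
X_est, t_est, y_est = X[half:], t[half:], y[half:]
ate_matched = matched_ate(X_est, t_est, y_est, weights)

# Naive difference in means, for comparison; it ignores confounding by x[0].
y1 = [yi for yi, ti in zip(y_est, t_est) if ti == 1]
y0 = [yi for yi, ti in zip(y_est, t_est) if ti == 0]
ate_naive = sum(y1) / len(y1) - sum(y0) / len(y0)

print("metric weights:", [round(w, 2) for w in weights])
print("matched ATE:", round(ate_matched, 2), "naive ATE:", round(ate_naive, 2))
```

Because the learned weights are just scaled absolute coefficients, the metric can be inspected directly: covariates with weight zero play no role in matching, and each unit's matched group can be listed and examined, which is the sense in which a sketch like this is auditable. In this synthetic setup the naive difference in means should be visibly biased by the confounder, while the matched estimate should land much closer to the assumed true effect of 1.0.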