The linear regression model is widely used in the biomedical and social sciences, as well as in policy and business research, to adjust for covariates and estimate the average effects of treatments. Behind every causal inference endeavor there is at least a notion of a randomized experiment. However, in routine regression analyses of observational studies, it is unclear how well the adjustments made by regression approximate key features of randomized experiments, such as covariate balance, study representativeness, sample boundedness, and unweighted sampling. In this paper, we provide software to address this question empirically. In the new lmw package for R, we compute the implied linear model weights for average treatment effects and provide diagnostics for them. The weights are obtained as part of the design stage of the study; that is, without using outcome information. The implementation is general and applicable, for instance, in settings with instrumental variables and multi-valued treatments; in essence, it applies in any situation where the linear model is the vehicle for adjustment and estimation of average treatment effects with discrete-valued interventions.
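To make the idea of implied weights concrete, the following is a minimal sketch in plain Python. It is not the lmw API (lmw is an R package) and covers only the simplest case: an ordinary least squares fit of the outcome on an intercept, a binary treatment indicator, and a single covariate. It relies on the standard Frisch-Waugh-Lovell result that the treatment coefficient is a weighted sum of outcomes, with weights built from the residuals of regressing the treatment on the covariate. The function name, the toy data, and the choice of diagnostics are all assumptions made for illustration.

```python
def implied_uri_weights(z, x):
    """Implied weights of the coefficient on z in the OLS fit y ~ 1 + z + x.

    Illustrative helper (hypothetical name): only the treatment z and the
    covariate x are used, never the outcome.
    """
    n = len(z)
    z_bar = sum(z) / n
    x_bar = sum(x) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxz = sum((xi - x_bar) * (zi - z_bar) for xi, zi in zip(x, z))
    slope = sxz / sxx
    # Residuals from regressing the treatment on (1, x).
    resid = [zi - (z_bar + slope * (xi - x_bar)) for zi, xi in zip(z, x)]
    ss = sum(r * r for r in resid)
    # Flip the sign for controls so the weights sum to one within each group.
    return [(r if zi == 1 else -r) / ss for r, zi in zip(resid, z)]


# Hypothetical toy data: 3 treated units, 5 controls, one covariate.
z = [1, 1, 1, 0, 0, 0, 0, 0]
x = [5.0, 6.0, 7.0, 1.0, 2.0, 3.0, 4.0, 6.0]

# Design stage: weights are computed without any outcome information.
w = implied_uri_weights(z, x)
treated = [i for i, zi in enumerate(z) if zi == 1]
control = [i for i, zi in enumerate(z) if zi == 0]

# Diagnostics on the weights.
sum_t = sum(w[i] for i in treated)          # weights sum to one (treated)
sum_c = sum(w[i] for i in control)          # weights sum to one (control)
xbar_t = sum(w[i] * x[i] for i in treated)  # weighted covariate mean, treated
xbar_c = sum(w[i] * x[i] for i in control)  # weighted covariate mean, control
n_negative = sum(1 for wi in w if wi < 0)   # negative weights extrapolate
ess_t = sum_t ** 2 / sum(w[i] ** 2 for i in treated)  # Kish effective sample size
ess_c = sum_c ** 2 / sum(w[i] ** 2 for i in control)

print("weight sums:", sum_t, sum_c)
print("weighted covariate means:", xbar_t, xbar_c)
print("negative weights:", n_negative)
print("effective sample sizes:", ess_t, ess_c)

# Analysis stage: the outcome enters only here. This y is simulated to be
# exactly linear, so the regression coefficient on z is 3 by construction.
y = [2.0 + 3.0 * zi + 0.5 * xi for zi, xi in zip(z, x)]
est = sum(w[i] * y[i] for i in treated) - sum(w[i] * y[i] for i in control)
print("weighted difference in means:", est)
```

In this sketch the weighted covariate means of the two groups coincide, which is the exact mean balance that linear adjustment implies, and the weighted difference in outcome means reproduces the regression coefficient on the treatment. Negative weights and a small effective sample size would flag departures from the bounded, unweighted sampling of a randomized experiment.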