The linear regression model is widely used in the biomedical and social sciences as well as in policy and business research to adjust for covariates and estimate the average effects of treatments. Behind every causal inference endeavor there is a hypothetical randomized experiment. However, in routine regression analyses in observational studies, it is unclear how well the adjustments made by regression approximate key features of randomized experiments, such as covariate balance, study representativeness, sample boundedness, and unweighted sampling. In this paper, we provide software to empirically address this question. We introduce the lmw package for R to compute the implied linear model weights and perform diagnostics for their evaluation. The weights are obtained as part of the design stage of the study; that is, without using outcome information. The implementation is general and applicable, for instance, in settings with instrumental variables and multi-valued treatments; in essence, in any situation where the linear model is the vehicle for adjustment and estimation of average treatment effects with discrete-valued interventions.
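To make the idea of implied weights concrete, the following is a minimal numerical sketch in plain Python. It is not the lmw package or its API. It assumes the simplest case: a binary treatment `z` entered additively alongside covariates `X` in an ordinary least squares fit. In that case the treatment coefficient can be written as a weighted sum of outcomes, with weights that depend only on `z` and `X`. The helper names (`solve`, `implied_weights`) and the simulated data are hypothetical and chosen for illustration only.

```python
import random


def solve(A, b):
    """Solve the linear system A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def implied_weights(z, X):
    """Return w such that the OLS coefficient on z in y ~ 1 + z + X equals sum(w_i * y_i).

    With design rows d_i = (1, z_i, x_i), the weight is w_i = e_z' (D'D)^{-1} d_i.
    No outcome is needed, so the weights are available at the design stage.
    """
    D = [[1.0, float(zi)] + [float(v) for v in xi] for zi, xi in zip(z, X)]
    p = len(D[0])
    gram = [[sum(d[a] * d[b] for d in D) for b in range(p)] for a in range(p)]
    e_z = [0.0] * p
    e_z[1] = 1.0  # selects the treatment coefficient
    c = solve(gram, e_z)  # c = (D'D)^{-1} e_z, using symmetry of D'D
    return [sum(ci * di for ci, di in zip(c, d)) for d in D]


# Simulated observational data (assumption: treatment depends on the first covariate).
random.seed(0)
n = 200
X = [[random.gauss(0, 1), random.gauss(0, 1)] for _ in range(n)]
z = [1 if x[0] + random.gauss(0, 1) > 0 else 0 for x in X]

w = implied_weights(z, X)

# Weights are positive on net for treated units and negative on net for controls.
sum_treated = sum(wi for wi, zi in zip(w, z) if zi == 1)
sum_control = sum(wi for wi, zi in zip(w, z) if zi == 0)

# Weighted covariate mean difference between groups, one entry per covariate.
imbalance = [sum(wi * x[j] for wi, x in zip(w, X)) for j in range(len(X[0]))]

# Kish effective sample size within each group, a common dispersion diagnostic.
ess_treated = sum_treated ** 2 / sum(wi ** 2 for wi, zi in zip(w, z) if zi == 1)
ess_control = sum_control ** 2 / sum(wi ** 2 for wi, zi in zip(w, z) if zi == 0)

print("sum of weights (treated, control):", sum_treated, sum_control)
print("weighted covariate imbalance:", imbalance)
print("effective sample sizes:", ess_treated, ess_control)
```

Under these assumptions the treated weights sum to 1, the control weights sum to -1, and the weighted covariate means of the two groups coincide, all up to floating-point error. These identities follow from the normal equations. They describe exact mean balance only; they do not say whether the weighted groups resemble the target population, nor whether individual weights are negative or highly dispersed, which is the kind of question the diagnostics in the abstract are aimed at.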