To estimate the causal effect of a treatment from observational data, it is important to balance the covariate distributions between the treated and control groups. However, balance can be difficult to achieve when the two groups have limited overlap. In that setting, researchers typically choose between two types of methods: 1) methods (e.g., inverse propensity score weighting) that imply traditional estimands (e.g., the ATE) but whose estimators are at risk of variance inflation and considerable statistical bias; and 2) methods (e.g., overlap weighting) that imply a different estimand, thereby changing the target population to reduce variance. In this work, we introduce a framework for characterizing estimands by their target populations and by the statistical performance of their estimators. We introduce a bias decomposition that separates total bias into 1) the statistical bias of the estimator; and 2) estimand mismatch, i.e., deviation from the population of interest. We propose a design-based estimand selection procedure that helps navigate the tradeoff among these two sources of bias and the variance of the resulting estimators. Our procedure allows analysts to incorporate their domain-specific preference for preserving the original population versus reducing statistical bias. We demonstrate how to select an estimand based on these preferences by applying our framework to a right heart catheterization study.
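The contrast between the two types of methods can be illustrated with a minimal sketch. This is not the selection procedure proposed in this work; it only compares inverse propensity score weights with overlap weights on simulated data, under assumptions of our own: the propensity score is treated as known, the data-generating process is invented for illustration, and the helper names (`balancing_weights`, `weighted_effect`, `effective_sample_size`) are hypothetical.

```python
import numpy as np

def balancing_weights(e, z, scheme):
    """Weights for treated (z=1) and control (z=0) units given propensity scores e."""
    if scheme == "ipw":          # tilting function h(x) = 1: targets the ATE
        h = np.ones_like(e)
    elif scheme == "overlap":    # h(x) = e(x)(1 - e(x)): targets the overlap population
        h = e * (1.0 - e)
    else:
        raise ValueError(f"unknown scheme: {scheme}")
    # Treated units get h/e, control units get h/(1 - e).
    return np.where(z == 1, h / e, h / (1.0 - e))

def weighted_effect(y, z, w):
    """Difference of weighted outcome means (normalized weights)."""
    t, c = z == 1, z == 0
    return np.average(y[t], weights=w[t]) - np.average(y[c], weights=w[c])

def effective_sample_size(w):
    """Kish effective sample size; small values signal variance inflation."""
    return w.sum() ** 2 / (w ** 2).sum()

# Simulated data (assumption): one covariate, steep propensity model, so the
# tails of x have limited overlap; the treatment effect varies with x.
rng = np.random.default_rng(0)
n = 2000
x = rng.normal(size=n)
e = 1.0 / (1.0 + np.exp(-2.5 * x))
z = rng.binomial(1, e)
y = 1.0 + x + z * (1.0 + 0.5 * x) + rng.normal(size=n)

for scheme in ("ipw", "overlap"):
    w = balancing_weights(e, z, scheme)
    print(
        scheme,
        "estimate:", round(float(weighted_effect(y, z, w)), 3),
        "ESS treated:", round(float(effective_sample_size(w[z == 1])), 1),
        "ESS control:", round(float(effective_sample_size(w[z == 0])), 1),
    )
```

In this sketch the two schemes target different estimands, so their estimates need not agree even in large samples. The inverse propensity score weights are unbounded as the propensity score approaches 0 or 1, whereas the overlap weights lie strictly between 0 and 1, which is what drives the variance reduction at the cost of a changed target population.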