We present a practical guide for the analysis of regression discontinuity (RD) designs in biomedical contexts. We begin by introducing key concepts, assumptions, and estimands within both the continuity-based framework and the local randomization framework. We then discuss modern estimation and inference methods within both frameworks, including approaches for bandwidth or local neighborhood selection, optimal treatment effect point estimation, and robust bias-corrected inference methods for uncertainty quantification. We also review empirical falsification tests that can be used to support key assumptions. Our discussion focuses on two features that are particularly relevant in biomedical research: (i) fuzzy RD designs, which often arise when therapeutic treatments are based on clinical guidelines but some patients with scores near the cutoff are treated contrary to the assignment rule; and (ii) RD designs with discrete scores, which are ubiquitous in biomedical applications. We illustrate our discussion with three empirical applications: the effect of CD4 guidelines for anti-retroviral therapy on retention of HIV patients in South Africa, the effect of genetic guidelines for chemotherapy on breast cancer recurrence in the United States, and the effects of age-based patient cost-sharing on healthcare utilization in Taiwan. Complete replication materials employing publicly available statistical software in Python, R, and Stata are provided, offering researchers all necessary tools to conduct an RD analysis.
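To make the fuzzy RD idea concrete, the following is a minimal sketch and not the replication code referred to above. It assumes simulated data, a hand-picked fixed bandwidth `h`, and hypothetical helper names (`local_linear_jump`, `fuzzy_rd`). It estimates the jump in the outcome and the jump in treatment take-up at the cutoff with triangular-kernel local linear fits, then takes their ratio. It does not perform data-driven bandwidth selection or robust bias-corrected inference, both of which a real analysis would need.

```python
import numpy as np

def local_linear_jump(x, v, cutoff, h):
    """Jump in E[v | x] at the cutoff, from separate weighted linear fits on each side."""
    def intercept(mask):
        xs, vs = x[mask] - cutoff, v[mask]
        w = 1.0 - np.abs(xs) / h                      # triangular kernel weights
        design = np.column_stack([np.ones_like(xs), xs])
        sw = np.sqrt(w)                               # weighted least squares via sqrt weights
        beta, *_ = np.linalg.lstsq(design * sw[:, None], vs * sw, rcond=None)
        return beta[0]                                # fitted value at the cutoff
    in_window = np.abs(x - cutoff) < h
    return intercept(in_window & (x >= cutoff)) - intercept(in_window & (x < cutoff))

def fuzzy_rd(x, y, d, cutoff, h):
    """Ratio of the outcome jump to the treatment take-up jump at the cutoff."""
    return local_linear_jump(x, y, cutoff, h) / local_linear_jump(x, d, cutoff, h)

# Simulated example (all values are assumptions chosen for illustration).
rng = np.random.default_rng(0)
n = 20_000
x = rng.uniform(-1.0, 1.0, n)                         # score, cutoff at 0
assigned = x >= 0.0                                   # guideline assignment
# Imperfect compliance: 20% treated below the cutoff, 80% above.
d = (rng.random(n) < 0.2 + 0.6 * assigned).astype(float)
y = 1.0 + 0.5 * x + 2.0 * d + rng.normal(0.0, 0.5, n) # true treatment effect is 2.0

first_stage = local_linear_jump(x, d, cutoff=0.0, h=0.5)
tau_hat = fuzzy_rd(x, y, d, cutoff=0.0, h=0.5)
print(f"first-stage jump: {first_stage:.3f}, fuzzy RD estimate: {tau_hat:.3f}")
```

With perfect compliance the take-up jump equals one and the ratio reduces to the sharp RD estimate. With a discrete score that has few distinct values near the cutoff, a local polynomial fit of this kind may be poorly supported, which is one reason the local randomization framework is discussed alongside it.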