Both cluster randomized trials and quasi-experimental designs are used to evaluate the impact of health and social policies and interventions. Stepped-wedge cluster randomized trials randomize the order in which clusters adopt an intervention over staggered time points, while recent difference-in-differences methods allow analysis of non-randomized settings where similar policies are adopted at different time points. These approaches have become common, but the sheer variety of methods for analyzing observational studies with staggered adoption makes it challenging to design and report such studies clearly. We propose that investigators conducting observational and quasi-experimental studies can address these challenges by emulating stepped-wedge cluster randomized trials within the target trial emulation framework. The conceptual framework and reporting standards of trial emulation encourage consideration of key features of these designs, such as policy heterogeneity and time-varying effects, and clear reporting of the estimand and assumptions. The framework also highlights areas where researchers working on randomized trials and on quasi-experimental designs can benefit from one another's experience. Questions of treatment effect heterogeneity, power, spillovers, and anticipation effects, among others, are common to both fields and can benefit from cross-pollination. This article also demonstrates how trial emulation can identify settings that are not well served by either approach, thereby avoiding studies unlikely to generate high-quality causal evidence. Finally, trial emulation informs the bias-variance-generalizability trade-off that arises from design and analysis choices in these settings, supporting better evidence generation and interpretation where important questions can be answered.