The Causal Roadmap outlines a systematic approach to asking and answering questions of cause and effect: define the quantity of interest, evaluate the needed assumptions, conduct statistical estimation, and carefully interpret the results. It is paramount that the algorithm for statistical estimation and inference be carefully pre-specified to optimize its expected performance for the specific real-data application. Simulations that realistically reflect the application, including key characteristics such as strong confounding and dependent or missing outcomes, can help us better understand how an estimator will perform in practice. We illustrate this with two examples, using the Causal Roadmap and realistic simulations to inform estimator selection and full specification of the Statistical Analysis Plan. First, in an observational longitudinal study, outcome-blind simulations are used to inform nuisance parameter estimation and variance estimation for longitudinal targeted maximum likelihood estimation (TMLE). Second, in a cluster-randomized controlled trial with missing outcomes, treatment-blind simulations are used to ensure Type-I error control in Two-Stage TMLE. In both examples, realistic simulations empower us to pre-specify an estimator that is expected to have strong finite-sample performance, and they also yield quality-controlled computing code for the actual analysis. Together, this process helps to improve the rigor and reproducibility of our research.
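As a rough illustration of how a treatment-blind simulation can check Type-I error, the sketch below repeatedly re-randomizes arm labels across clusters, so the true effect is zero by construction, and records how often a test rejects. It is not the authors' procedure: a difference in cluster-level means with a normal-approximation critical value stands in for Two-Stage TMLE, the cluster outcomes are synthetic, and the names `welch_z` and `simulate_type1_error` are hypothetical.

```python
import random
import statistics


def welch_z(treated, control):
    """Difference in means divided by its unpooled standard error."""
    se = (statistics.variance(treated) / len(treated)
          + statistics.variance(control) / len(control)) ** 0.5
    return (statistics.mean(treated) - statistics.mean(control)) / se


def simulate_type1_error(cluster_outcomes, n_sims=2000, alpha=0.05, seed=1):
    """Re-randomize arm labels (true effect is zero) and return the rejection rate."""
    rng = random.Random(seed)
    # Normal-approximation critical value; an assumption made for simplicity.
    crit = statistics.NormalDist().inv_cdf(1 - alpha / 2)
    n = len(cluster_outcomes)
    half = n // 2
    rejections = 0
    for _ in range(n_sims):
        # A random permutation assigns the first half to the "treated" arm.
        shuffled = rng.sample(cluster_outcomes, n)
        stat = welch_z(shuffled[:half], shuffled[half:])
        rejections += abs(stat) > crit
    return rejections / n_sims


# Synthetic cluster-level outcomes (hypothetical; real use would plug in
# the study's cluster-level data with the true arm labels masked).
data_rng = random.Random(0)
cluster_outcomes = [data_rng.gauss(0.3, 0.1) for _ in range(20)]

rate = simulate_type1_error(cluster_outcomes)
print(f"Empirical Type-I error: {rate:.3f}")
```

Comparing the printed rate with the nominal 5% level gives a quick sense of whether the chosen test is anti-conservative with this few clusters. The same loop could wrap any candidate estimator, which is one way such simulations can guide pre-specification.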