The G-formula is a popular approach for estimating treatment or exposure effects from longitudinal data that are subject to time-varying confounding. G-formula estimation is typically performed by Monte Carlo simulation, with the non-parametric bootstrap used for inference. We show that the G-formula can instead be implemented by exploiting existing methods for multiple imputation (MI) for synthetic data. Doing so involves using an existing modified version of Rubin's variance estimator. In practice, missing data are ubiquitous in longitudinal datasets. We show that such missing data can be readily accommodated as part of the MI procedure, and we describe how MI software can be used to implement the approach. We explore its performance in a simulation study.
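To make the conventional Monte Carlo G-formula concrete, the following is a minimal sketch only. It is not the MI-based implementation proposed here, and it omits the bootstrap step. Everything in it is an illustrative assumption: the two-period data-generating model, the linear working models, and the helper names `ols`, `simulate_data` and `g_formula_mean` are all hypothetical.

```python
import math
import random

def ols(X, y):
    """Least-squares coefficients via the normal equations (Gauss-Jordan)."""
    p = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(r[i] * r[j] for r in X) for j in range(p)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(p)]
    for c in range(p):
        piv = max(range(c, p), key=lambda k: abs(A[k][c]))
        A[c], A[piv] = A[piv], A[c]
        for k in range(p):
            if k != c:
                f = A[k][c] / A[c][c]
                A[k] = [a - f * b for a, b in zip(A[k], A[c])]
    return [A[i][p] / A[i][i] for i in range(p)]

def simulate_data(n, rng):
    """Hypothetical data: L1 is affected by A0 and confounds the A1-Y relation."""
    rows = []
    for _ in range(n):
        l0 = rng.gauss(0, 1)
        a0 = 1 if l0 + rng.gauss(0, 1) > 0 else 0
        l1 = 0.5 * l0 + 0.5 * a0 + rng.gauss(0, 1)
        a1 = 1 if l1 + rng.gauss(0, 1) > 0 else 0
        y = 0.5 * l0 + a0 + l1 + a1 + rng.gauss(0, 1)
        rows.append((l0, a0, l1, a1, y))
    return rows

def g_formula_mean(rows, a0, a1, rng):
    """Monte Carlo G-formula estimate of E[Y] under the static regime (a0, a1)."""
    # Working model for the time-varying confounder: L1 ~ L0 + A0.
    XL = [[1.0, r[0], r[1]] for r in rows]
    l1_obs = [r[2] for r in rows]
    bL = ols(XL, l1_obs)
    resid = [l1 - sum(b * x for b, x in zip(bL, xr)) for xr, l1 in zip(XL, l1_obs)]
    sd = math.sqrt(sum(e * e for e in resid) / (len(rows) - len(bL)))
    # Working model for the outcome: Y ~ L0 + A0 + L1 + A1.
    XY = [[1.0, r[0], r[1], r[2], r[3]] for r in rows]
    bY = ols(XY, [r[4] for r in rows])
    total = 0.0
    for r in rows:
        l0 = r[0]
        # Simulate L1 under the intervention A0 = a0, then predict the mean of Y.
        l1_sim = bL[0] + bL[1] * l0 + bL[2] * a0 + rng.gauss(0, sd)
        total += bY[0] + bY[1] * l0 + bY[2] * a0 + bY[3] * l1_sim + bY[4] * a1
    return total / len(rows)

rng = random.Random(2023)
rows = simulate_data(5000, rng)
# In this assumed model the true always-treat vs never-treat contrast is 2.5.
effect = g_formula_mean(rows, 1, 1, rng) - g_formula_mean(rows, 0, 0, rng)
print(round(effect, 2))
```

In this sketch the confounder is simulated forward under each treatment regime and the outcome model is averaged over the simulated histories. Inference would conventionally come from repeating the whole procedure on bootstrap resamples of `rows`.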