Regression calibration, as developed by Rosner, Spiegelman, and Willett, is used to correct bias in effect estimates caused by measurement error in continuous exposures. The method involves two models: a measurement error model (MEM) relating the mismeasured exposure to the true exposure, and an outcome model relating the mismeasured exposure to the outcome. However, no comprehensive guidance exists for determining which covariates should be included in each model. In this paper, we investigate the selection of the minimal and most efficient covariate adjustment sets under a causal inference framework. We show that, to correct for measurement error, researchers must adjust, in both the MEM and the outcome model, for any common causes of (1) the true exposure and the outcome and (2) the measurement error and the outcome. When such variables are available only in the main study, researchers should still adjust for them in the outcome model to reduce bias, provided that these covariates are at most weakly associated with the measurement error. We also show that adjusting in the outcome model for so-called prognostic variables, which are independent of both the true exposure and the measurement error, may increase efficiency, whereas adjusting for covariates that are associated only with the true exposure generally results in efficiency loss in realistic settings. We apply the proposed covariate selection approach to the Health Professionals Follow-up Study dataset to study the effect of fiber intake on cardiovascular disease. Finally, we extend the originally proposed estimators to a non-parametric setting in which effect modification by covariates is allowed.
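The two-model structure described above can be illustrated with a minimal simulated sketch. It is not the paper's estimator or data: the linear outcome model, the classical additive error, the single common cause `v`, the sample sizes, the coefficient values, and the helper names `ols`, `simulate`, and `regression_calibration_demo` are all assumptions made for illustration. The MEM is fitted in a validation sample by regressing the true exposure on the mismeasured exposure and `v`, and the corrected estimate is taken as the ratio of the outcome-model slope to the MEM slope.

```python
import numpy as np


def ols(y, *columns):
    """Least-squares coefficients of y on an intercept plus the given columns."""
    design = np.column_stack([np.ones_like(y), *columns])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef


def simulate(n, rng, beta=1.0):
    """Hypothetical data: v is a common cause of the true exposure and the outcome."""
    v = rng.normal(size=n)
    x = 0.5 * v + rng.normal(size=n)             # true exposure
    z = x + rng.normal(scale=0.8, size=n)        # mismeasured exposure (classical error)
    y = beta * x + 0.8 * v + rng.normal(size=n)  # continuous outcome
    return v, x, z, y


def regression_calibration_demo(seed=0):
    rng = np.random.default_rng(seed)
    # Main study: true exposure is not observed, so it is discarded here.
    v_main, _, z_main, y_main = simulate(20_000, rng)
    # Validation study: true and mismeasured exposure are both observed.
    v_val, x_val, z_val, _ = simulate(2_000, rng)

    mem = ols(x_val, z_val, v_val)          # MEM: x ~ z + v
    outcome = ols(y_main, z_main, v_main)   # outcome model: y ~ z + v

    naive = outcome[1]                      # attenuated slope on z
    corrected = outcome[1] / mem[1]         # regression calibration correction
    return naive, corrected


if __name__ == "__main__":
    naive, corrected = regression_calibration_demo()
    print(f"naive slope: {naive:.3f}, corrected slope: {corrected:.3f}")
```

In this setup the common cause `v` enters both fitted models, which mirrors the adjustment rule stated in the abstract; the naive slope is attenuated toward zero, while the corrected slope should lie close to the simulated effect of 1.0.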