A simple device for balancing a continuous covariate in clinical trials is to stratify by whether the covariate is above or below some target value, typically the predicted median. This raises the question of which model should be used to estimate the effect of treatment on the outcome variable, $Y$. Should one fit the stratum indicator, $S$, the continuous covariate, $X$, both, or neither? When a covariate is added to a linear model there are three consequences for inference: 1) the mean square error effect, 2) the variance inflation factor and 3) second order precision. We believe it is valuable to consider these three factors separately, even if, ultimately, it is their joint effect that matters. We present some simple theory, concentrating in particular on the variance inflation factor, that may be used to guide trialists in their choice of model. We also consider the case where the precise form of the relationship between the outcome and the covariate is not known. We conclude by recommending that the continuous covariate should always be in the model but that, depending on circumstances, there may be some justification for fitting the stratum indicator also.
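To make the variance inflation factor concrete, the following is a minimal illustrative sketch and not the authors' derivation. It uses the standard result that, in a linear model, the variance of the treatment estimate is inflated by $1/(1-R^2)$, where $R^2$ comes from regressing the treatment indicator on the other terms in the model. The function names `treatment_vif` and `simulate`, the sample size, the normal distribution for $X$, and the use of the sample median (rather than a predicted median) as the cut point are all assumptions made for illustration.

```python
import numpy as np

def treatment_vif(t, covariates):
    """VIF for the treatment coefficient: TSS / RSS = 1 / (1 - R^2), where R^2
    is from regressing treatment on an intercept plus the given covariates."""
    n = len(t)
    Z = np.column_stack([np.ones(n)] + list(covariates))
    coef, *_ = np.linalg.lstsq(Z, t, rcond=None)
    resid = t - Z @ coef
    tss = np.sum((t - t.mean()) ** 2)
    rss = resid @ resid
    return tss / rss

def simulate(n=40, seed=0):
    """Hypothetical trial: stratify on X above/below the sample median, then
    allocate exactly half of each stratum to treatment (an assumption)."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)                     # continuous covariate X
    s = (x > np.median(x)).astype(float)       # stratum indicator S
    t = np.zeros(n)                            # treatment indicator
    for level in (0.0, 1.0):
        idx = np.flatnonzero(s == level)
        treated = rng.choice(idx, size=len(idx) // 2, replace=False)
        t[treated] = 1.0
    return x, s, t

x, s, t = simulate()
vifs = {
    "S only": treatment_vif(t, [s]),
    "X only": treatment_vif(t, [x]),
    "X and S": treatment_vif(t, [x, s]),
}
for model, vif in vifs.items():
    print(f"{model}: VIF = {vif:.4f}")
```

Under this allocation scheme treatment is exactly balanced within each stratum, so fitting $S$ alone leaves the variance inflation factor at 1, whereas fitting $X$ (alone or with $S$) can only raise it, because chance imbalance in $X$ within strata remains. This addresses only the second of the three consequences listed above; the mean square error effect and second order precision are not captured by the sketch.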