Shariat et al. previously investigated whether clinical data (including Gleason grade and stage) and preoperative biomarkers could be used to predict which of any pair of patients would suffer recurrence of prostate cancer first. We wished to establish the extent to which predictions of time of relapse from such a model could be improved upon using Bayesian methods. We reanalysed the same dataset with a Bayesian skew-Student mixture model, predicting both which of any pair of patients would relapse first and the time of relapse. The benefit of using these biomarkers, relative to predictions made without them, was measured by the apparent Shannon information, using as prior an exponential attrition model of relapse time independent of input variables. Using half the dataset for training and the other half for testing, predictions of relapse time from the strict Cox model gave $-\infty$ nepers of apparent Shannon information, because that model predicts that relapse can only occur at times when patients in the training set relapsed. Deliberately smoothed predictions from the Cox model gave -0.001 (-0.131 to +0.120) nepers, while the Bayesian model gave +0.109 (+0.021 to +0.192) nepers (mean, 2.5 to 97.5 centiles), being positive with posterior probability 0.993 and beating the smoothed Cox model with posterior probability 0.927. These predictions from the Bayesian model thus outperform those of the Cox model, but the low overall yield of predictive information leaves scope for improving the range of biomarkers in use. The Bayesian model presented here is the first such model for prostate cancer to treat the variation of relapse hazard with biomarker concentrations as smooth, as is intuitive. It is also the first to be shown to provide more apparent Shannon information than the Cox model, and the first to be shown to provide positive apparent information relative to an exponential prior.
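To illustrate why a predictive distribution concentrated on the training relapse times scores $-\infty$ nepers, the following is a minimal sketch, not the authors' method. It assumes that apparent Shannon information can be approximated as the mean natural-log ratio of the model's predictive density to the prior density at each observed relapse time, and it ignores censoring. The function names, the relapse times, and the hazard rates are all hypothetical.

```python
import math


def exponential_pdf(t, rate):
    """Density at time t of an exponential relapse-time model with constant hazard `rate`."""
    return rate * math.exp(-rate * t)


def apparent_information(model_densities, prior_densities):
    """Mean log density ratio over test cases, in nepers (natural-log units).

    Assumed definition for illustration only. Returns -inf as soon as the
    model assigns zero density to an observed relapse time.
    """
    total = 0.0
    for p_model, p_prior in zip(model_densities, prior_densities):
        if p_model == 0.0:
            return -math.inf
        total += math.log(p_model / p_prior)
    return total / len(model_densities)


# Hypothetical observed relapse times in the test half (arbitrary units).
times = [1.0, 2.5, 4.0]

# Prior: exponential attrition with a hypothetical hazard of 0.2, independent of inputs.
prior = [exponential_pdf(t, 0.2) for t in times]

# A smooth (hypothetical) model that gives non-zero density to every time.
smooth = [exponential_pdf(t, 0.3) for t in times]

# A model whose density is zero away from the training relapse times, so that
# every unseen test time receives density 0.
point_mass = [0.0, 0.0, 0.0]

print(apparent_information(smooth, prior))      # finite value
print(apparent_information(point_mass, prior))  # -inf
```

Under this assumed definition, a model identical to the prior scores exactly zero, so a positive value indicates that the input variables carried some predictive information.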