Fully Bayesian methods for Cox models require a model for the baseline hazard function. Parametric approaches generally yield monotone estimates. Semi-parametric choices allow more flexible patterns, but they can suffer from overfitting and instability. Regularization through prior distributions with correlated structures usually gives reasonable answers in these situations. We discuss Bayesian regularization for Cox survival models defined via flexible baseline hazards specified by a mixture of piecewise constant functions and by a cubic B-spline function. For these semi-parametric proposals, different prior scenarios, ranging from prior independence to particular correlated structures, are discussed in a real study with micro-virulence data and in an extensive simulation study that varies the sample size and the size of the time axis partition in order to capture risk variations. The posterior distribution of the parameters was approximated using Markov chain Monte Carlo methods. Model selection was performed using the Deviance Information Criterion (DIC) and the Log Pseudo-Marginal Likelihood (LPML). The results reveal that, in general, Cox models are highly robust in their covariate effects and survival estimates regardless of the baseline hazard specification. Among the semi-parametric baseline hazard specifications, the B-spline hazard function is less dependent on the regularization process than the piecewise specification because it requires a smaller partition of the time axis to estimate similar behaviour of the risk.
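To make the piecewise constant specification and the contrast between independent and correlated priors concrete, the following is a minimal sketch in Python with NumPy. It is an illustration under stated assumptions, not the implementation used in the study: the function names, the prior standard deviations, and the choice of a first-order Gaussian random walk on the log-heights as the correlated structure are all assumptions made here for the example. Times are assumed to lie within the partition.

```python
import numpy as np

def piecewise_constant_hazard(t, knots, log_heights):
    """Baseline hazard h0(t) = exp(phi_k) for t in (knots[k], knots[k+1]]."""
    t = np.asarray(t, dtype=float)
    # Interval index for each t, clipped to the valid range at the boundaries.
    k = np.clip(np.searchsorted(knots, t, side="left") - 1, 0, len(log_heights) - 1)
    return np.exp(np.asarray(log_heights, dtype=float))[k]

def cumulative_hazard(t, knots, log_heights):
    """H0(t): sum over intervals of (height) x (time spent in the interval up to t)."""
    t = np.asarray(t, dtype=float)
    knots = np.asarray(knots, dtype=float)
    exposure = np.clip(t[..., None] - knots[:-1], 0.0, np.diff(knots))
    return exposure @ np.exp(np.asarray(log_heights, dtype=float))

def survival(t, x, beta, knots, log_heights):
    """Cox survival S(t | x) = exp(-H0(t) * exp(x' beta))."""
    return np.exp(-cumulative_hazard(t, knots, log_heights) * np.exp(np.dot(x, beta)))

def log_prior_independent(log_heights, sd=10.0):
    """Unnormalised log-density: independent N(0, sd^2) on each log-height (assumed prior)."""
    phi = np.asarray(log_heights, dtype=float)
    return -0.5 * np.sum((phi / sd) ** 2)

def log_prior_random_walk(log_heights, sd_first=10.0, sd_step=0.5):
    """Unnormalised log-density of a first-order random walk (assumed correlated prior):
    phi_1 ~ N(0, sd_first^2), phi_k | phi_{k-1} ~ N(phi_{k-1}, sd_step^2)."""
    phi = np.asarray(log_heights, dtype=float)
    steps = np.diff(phi)
    return -0.5 * (phi[0] / sd_first) ** 2 - 0.5 * np.sum((steps / sd_step) ** 2)

# Hypothetical partition with three intervals and hazard heights 1, 2, 3.
knots = [0.0, 1.0, 2.0, 3.0]
log_heights = np.log([1.0, 2.0, 3.0])
print(piecewise_constant_hazard(1.5, knots, log_heights))
print(cumulative_hazard(2.5, knots, log_heights))
print(survival(2.5, [0.0], [1.0], knots, log_heights))
```

In this sketch the random-walk prior penalises large jumps between neighbouring intervals, which is one way a correlated structure can regularize a fine partition; the independent prior places no such constraint on adjacent heights.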