In heterogeneous disease settings, accounting for intrinsic sample variability is crucial for obtaining reliable and interpretable omic network estimates. However, most graphical model analyses of biomedical data assume a homogeneous conditional dependence structure across samples, which can lead to misleading conclusions. To address this, we propose a joint Gaussian graphical model that leverages sample-level ordinal covariates (e.g., disease stage) to account for heterogeneity and improve the estimation of partial correlation structures. Our modelling framework, called NExON-Bayes, extends the graphical spike-and-slab framework to ordinal covariates: it jointly estimates their relevance to the graph structure and uses them to improve the accuracy of network estimation. To scale to high-dimensional omic settings, we develop an efficient variational inference algorithm tailored to our model. Through simulations, we demonstrate that our method outperforms the vanilla graphical spike-and-slab (which uses no covariate information), as well as other state-of-the-art network approaches that exploit covariate information. Applying our method to reverse phase protein array data from patients diagnosed with stage I, II or III breast carcinoma, we estimate how proteomic networks change as breast carcinoma progresses. Our model provides insights not only through inspection of the estimated proteomic networks, but also through the estimated ordinal covariate dependencies of key groups of proteins within those networks, offering a comprehensive picture of how biological pathways shift across disease stages. Availability and Implementation: A user-friendly R package for NExON-Bayes with tutorials is available on GitHub at github.com/jf687/NExON.