Graphical models are powerful tools for investigating complex dependency structures in high-throughput datasets. However, most existing graphical models make one of two canonical assumptions: (i) homogeneity, with a common network shared by all subjects; or (ii) normality, particularly in the context of Gaussian graphical models. Both assumptions are restrictive and can fail to hold in applications such as proteomic networks in cancer. To this end, we propose robust Bayesian graphical regression (rBGR), an approach for estimating heterogeneous graphs from non-normally distributed data. rBGR is a flexible framework that accommodates non-normality through random marginal transformations and accommodates heterogeneity by constructing covariate-dependent graphs through graphical regression techniques. We formulate a new characterization of edge dependencies in such models, called conditional sign independence with covariates, along with an efficient posterior sampling algorithm. In simulation studies, we demonstrate that rBGR outperforms existing graphical regression models in both edge and covariate selection for data generated under various levels of non-normality. We use rBGR to assess proteomic networks in two cancers, lung and ovarian, to systematically investigate the effects of immunogenic heterogeneity within tumors. Our analyses reveal several important protein-protein interactions that are differentially impacted by immune cell abundance; some corroborate existing biological knowledge, whereas others are novel findings.