Gaussian graphical models (GGMs) are widely used to recover the conditional independence structure among random variables. Recent work has sought to incorporate auxiliary covariates to improve estimation, particularly in applications such as co-expression quantitative trait locus (eQTL) studies, where both gene expression levels and their conditional dependence structure may be influenced by genetic variants. Existing approaches to covariate-adjusted GGMs either restrict covariate effects to the mean structure or lead to nonconvex formulations when jointly estimating the mean and precision matrix. In this paper, we propose a convex framework that simultaneously estimates the covariate-adjusted mean and precision matrix via a natural parametrization of the multivariate Gaussian likelihood. The resulting formulation enables joint convex optimization and yields improved theoretical guarantees under high-dimensional scaling, where the sparsity and dimension of the covariates grow with the sample size. We support our theoretical findings with numerical simulations and demonstrate the practical utility of the proposed method through a reanalysis of an eQTL study of glioblastoma multiforme and an analysis of the effect of diet on the human gut microbiome.
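The role of the natural parametrization can be illustrated in the simplest setting. The following is a sketch under stated assumptions, not the paper's full model: a single observation $y \in \mathbb{R}^p$, a precision matrix $\Theta \succ 0$ that does not depend on covariates, and a natural parameter $\eta = \Theta\mu$. The symbols $y$, $\mu$, $\Theta$, and $\eta$ are notation introduced here for illustration.

```latex
% Negative Gaussian log-likelihood in the mean parametrization (\mu, \Theta):
-\log p(y \mid \mu, \Theta)
  = \tfrac{1}{2}(y - \mu)^{\top}\Theta\,(y - \mu)
    - \tfrac{1}{2}\log\det\Theta + \tfrac{p}{2}\log(2\pi).
% The quadratic form couples \mu and \Theta multiplicatively,
% so this objective is not jointly convex in (\mu, \Theta).

% Substituting the natural parameter \eta = \Theta\mu gives
-\log p(y \mid \eta, \Theta)
  = \tfrac{1}{2}\, y^{\top}\Theta\, y
    - \eta^{\top} y
    + \tfrac{1}{2}\,\eta^{\top}\Theta^{-1}\eta
    - \tfrac{1}{2}\log\det\Theta
    + \tfrac{p}{2}\log(2\pi).
% Each term is jointly convex in (\eta, \Theta) on \Theta \succ 0:
%   y^T \Theta y and \eta^T y are linear,
%   \eta^T \Theta^{-1} \eta is the matrix-fractional function,
%   -\log\det\Theta is convex.
```

If the natural parameter is modeled as linear in a covariate vector $x$, say $\eta = Bx$ for a hypothetical coefficient matrix $B$, joint convexity in $(B, \Theta)$ follows because composition with a linear map preserves convexity. How the covariates enter the precision matrix in the proposed framework is not specified in this abstract and is not covered by this sketch.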