Multivariable Bidirectional Mendelian Randomization via Bayesian Directed Cyclic Graphical Models with Correlated Errors

Mendelian randomization (MR) is a pivotal tool in genetics, genomics, and epidemiology that uses genetic variants as instrumental variables to infer causal relationships between exposures and outcomes. Traditional MR methods, while powerful, often rely on stringent assumptions, such as the absence of feedback loops, that are frequently violated in complex biological networks. In addition, many popular MR approaches consider only two variables (i.e., one exposure and one outcome), whereas our motivating application, gene regulatory networks, involves many variables. In this article, we introduce a novel Bayesian framework for multivariable MR that simultaneously addresses unmeasured confounding and feedback loops. Central to our approach is a sparse conditional cyclic graphical model with a sparse error variance-covariance matrix. Two structural priors, one on the causal graph and one on the error covariance structure, enable modeling and inference of both causal relationships and latent confounding structures. Our method is designed to work with summary-level data, which facilitates its application in settings where individual-level data are inaccessible, e.g., due to privacy concerns. It can also account for horizontal pleiotropy, for which we establish sufficient conditions for identifiability. Through extensive simulations and applications to the GTEx and OneK1K data, we demonstrate the superior performance of our approach in recovering biologically plausible causal relationships in the presence of possible feedback loops and unmeasured confounding. Using posterior samples, we further quantify uncertainty in inferred network motifs by computing their posterior probabilities. The R package MR.RGM, which implements the proposed method, is available on CRAN (https://cran.r-project.org/package=MR.RGM).
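To illustrate the final step, the posterior probability of a network motif can be estimated as the fraction of posterior draws whose graph contains every edge of that motif. The sketch below is a minimal illustration in Python, not the MR.RGM implementation: the function name `motif_posterior_probability`, the convention that `adj[i][j] == 1` denotes a directed edge from node i to node j, and the toy draws are all assumptions made for this example.

```python
from typing import List, Sequence, Tuple

Edge = Tuple[int, int]            # (source, target), read as source -> target
Adjacency = List[List[int]]       # adj[i][j] == 1 means edge i -> j (assumed convention)


def motif_posterior_probability(
    samples: Sequence[Adjacency], motif_edges: Sequence[Edge]
) -> float:
    """Return the fraction of posterior draws that contain every edge in motif_edges."""
    if not samples:
        raise ValueError("need at least one posterior sample")
    hits = sum(
        all(adj[src][dst] == 1 for src, dst in motif_edges)
        for adj in samples
    )
    return hits / len(samples)


# Hypothetical posterior draws over a 3-node network (toy values, not real output).
toy_samples: List[Adjacency] = [
    [[0, 1, 0], [1, 0, 0], [0, 0, 0]],  # 0 -> 1 and 1 -> 0 (feedback loop)
    [[0, 1, 0], [0, 0, 1], [0, 0, 0]],  # 0 -> 1 and 1 -> 2
    [[0, 1, 0], [1, 0, 1], [0, 0, 0]],  # feedback loop plus 1 -> 2
    [[0, 0, 0], [1, 0, 0], [0, 0, 0]],  # 1 -> 0 only
]

# A two-node feedback loop is the motif "0 -> 1 and 1 -> 0".
feedback_loop: List[Edge] = [(0, 1), (1, 0)]
p_feedback = motif_posterior_probability(toy_samples, feedback_loop)
p_single_edge = motif_posterior_probability(toy_samples, [(0, 1)])
```

With real output, `samples` would be the thinned post-burn-in graph draws from the sampler. Any motif that can be written as a set of required edges, such as a feed-forward chain or a longer cycle, can be scored the same way.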
