In Bayesian Network Regression models, networks serve as the predictors of continuous responses. These models have been used successfully in brain research to identify regions of the brain that are associated with specific human traits, yet their potential to elucidate microbial drivers of biological phenotypes in microbiome research remains unknown. In particular, microbial networks are challenging because of their high dimensionality and high sparsity compared to brain networks. Furthermore, unlike in brain connectome research, in microbiome research it is usually expected that the presence of the microbes themselves (main effects), not just their interactions, has an effect on the response. Here, we present the first thorough investigation of whether Bayesian Network Regression models are suitable for microbial datasets, using a variety of synthetic data generated under realistic biological scenarios. We test whether the Bayesian Network Regression model that accounts only for interaction effects (edges in the network) is able to identify the key drivers (microbes) of phenotypic variability. We show that this model is indeed able to identify influential nodes and edges in the microbial networks that drive changes in the phenotype for most biological settings, but we also identify scenarios where the method performs poorly, which allows us to provide practical advice for domain scientists aiming to apply these tools to their datasets. Finally, we implement the model in a publicly available Julia package at https://github.com/solislemuslab/BayesianNetworkRegression.jl.
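To make the interaction-only setting concrete, the following is a minimal sketch of how synthetic data of this kind might be simulated. The abstract does not state the model equation, so the generative form used here, y_i = mu + sum over edges (j,k) of A_i[j][k] * B[j][k] + noise, is an assumption, as is the rule that an edge coefficient is nonzero only when both of its endpoints are influential microbes. The function name and all parameters are hypothetical, and nothing here reflects the API of BayesianNetworkRegression.jl.

```python
import random


def simulate_network_regression(n_samples=50, n_nodes=8, n_influential=3,
                                edge_prob=0.2, mu=0.5, noise_sd=0.1, seed=0):
    """Simulate responses driven only by edge (interaction) effects.

    Assumed generative form (not taken from the source):
        y_i = mu + sum_{j<k} A_i[j][k] * B[j][k] + eps_i
    where A_i is the binary adjacency matrix of sample i's network.
    """
    rng = random.Random(seed)

    # Pick the hypothetical "influential" nodes (microbes).
    influential = set(rng.sample(range(n_nodes), n_influential))

    # Assumption: an edge has a nonzero coefficient only if both of its
    # endpoints are influential nodes.
    B = [[0.0] * n_nodes for _ in range(n_nodes)]
    for j in range(n_nodes):
        for k in range(j + 1, n_nodes):
            if j in influential and k in influential:
                B[j][k] = B[k][j] = rng.gauss(1.0, 0.5)

    networks, responses = [], []
    for _ in range(n_samples):
        # Sparse, symmetric, binary network for this sample.
        A = [[0] * n_nodes for _ in range(n_nodes)]
        for j in range(n_nodes):
            for k in range(j + 1, n_nodes):
                if rng.random() < edge_prob:
                    A[j][k] = A[k][j] = 1

        # Response depends on the edges only; there are no node main effects.
        signal = sum(A[j][k] * B[j][k]
                     for j in range(n_nodes)
                     for k in range(j + 1, n_nodes))
        networks.append(A)
        responses.append(mu + signal + rng.gauss(0.0, noise_sd))

    return networks, responses, B, sorted(influential)
```

A lower `edge_prob` mimics the sparsity that makes microbial networks harder than brain networks: the few edges with nonzero coefficients then appear in only a handful of samples, leaving less signal from which to recover the influential nodes.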