Analyzing multivariate count data generated by high-throughput sequencing in microbiome studies is challenging because the data are high-dimensional, compositional, and overdispersed. In practice, researchers are often interested in how the microbiome may mediate the relation between an assigned treatment and an observed phenotypic response. Existing approaches to compositional mediation analysis cannot simultaneously determine the presence of direct effects, relative indirect effects, and overall indirect effects while quantifying their uncertainty. We propose a Bayesian joint model for compositional data that allows for the identification, estimation, and uncertainty quantification of various causal estimands in high-dimensional mediation analysis. We conduct simulation studies and compare our method's performance in selecting mediation effects with that of existing methods. Finally, we apply our method to a benchmark data set investigating the effect of sub-therapeutic antibiotic treatment on body weight in early-life mice.