Analyzing multivariate count data generated by high-throughput sequencing in microbiome studies is challenging because the data are high-dimensional, compositional, and overdispersed. In practice, researchers are often interested in how the microbiome may mediate the relation between an assigned treatment and an observed phenotypic response. Existing approaches to compositional mediation analysis cannot simultaneously determine the presence of direct effects, relative indirect effects, and overall indirect effects while quantifying their uncertainty. We propose a Bayesian joint model for compositional data that allows for the identification, estimation, and uncertainty quantification of various causal estimands in high-dimensional mediation analysis. We conduct simulation studies and compare our method's performance in selecting mediation effects with that of existing methods. Finally, we apply our method to a benchmark data set investigating the effect of sub-therapeutic antibiotic treatment on body weight in early-life mice.