We aim to learn a functional co-response group: a group of taxa whose co-response effect (the representative characteristic of the group, capturing the total topological abundance of its taxa) co-responds to a functional variable, that is, associates well with it statistically. Unlike the state-of-the-art method, we model the soil microbial community as an ecological co-occurrence network in which taxa are nodes (weighted by their abundance) and their relationships (combining spatial and functional ecological aspects) are edges (weighted by relationship strength). We then design a method called gFlora, which uses graph convolution over this co-occurrence network to obtain the co-response effect of the group, so that network topology is also taken into account in the discovery process. We evaluate gFlora on two real-world soil microbiome datasets (bacteria and nematodes) and compare it with the state-of-the-art method. gFlora outperforms it on all evaluation metrics and discovers new functional evidence for taxa that have so far been under-studied. We show that the graph convolution step is crucial for taxa with relatively low abundance (thus removing the bias towards taxa with higher abundance), and that the discovered bacteria of different genera are distributed across the co-occurrence network yet remain tightly connected among themselves, demonstrating that, topologically, they fill different but collaborative functional roles in the ecological community.
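The idea of obtaining a group-level co-response effect by graph convolution over a weighted co-occurrence network can be illustrated with a minimal sketch. This is not gFlora's actual formulation, which the abstract does not spell out. It assumes a standard symmetric-normalized convolution with self-loops, a plain sum over the group's smoothed abundances as the co-response effect, and Pearson correlation as the association measure. All function names and the toy data are hypothetical.

```python
import numpy as np

def normalized_adjacency(adj):
    """Symmetric normalization D^{-1/2} (A + I) D^{-1/2} (assumed, GCN-style)."""
    a = adj + np.eye(adj.shape[0])       # self-loops keep each taxon's own abundance
    d_inv_sqrt = 1.0 / np.sqrt(a.sum(axis=1))
    return a * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def co_response_effect(abundance, adj, group):
    """Per-sample group effect: sum of graph-smoothed abundances of the group's taxa.

    abundance: (n_samples, n_taxa) array of taxon abundances
    adj:       (n_taxa, n_taxa) symmetric, non-negative edge weights
    group:     list of taxon indices forming the candidate group
    """
    smoothed = abundance @ normalized_adjacency(adj)  # one convolution step
    return smoothed[:, group].sum(axis=1)

def association(effect, functional_variable):
    """Pearson correlation between the group effect and the functional variable."""
    return float(np.corrcoef(effect, functional_variable)[0, 1])

# Toy data (made up for illustration): 4 soil samples, 3 taxa.
abundance = np.array([[10.0, 1.0, 0.0],
                      [ 8.0, 2.0, 1.0],
                      [ 2.0, 5.0, 4.0],
                      [ 1.0, 6.0, 7.0]])
adj = np.array([[0.0, 0.2, 0.0],
                [0.2, 0.0, 0.9],
                [0.0, 0.9, 0.0]])
y = np.array([0.1, 0.3, 0.8, 1.0])       # hypothetical functional variable

effect = co_response_effect(abundance, adj, group=[1, 2])
score = association(effect, y)
```

Because each taxon's smoothed abundance mixes in the abundances of its neighbors, a low-abundance taxon that is strongly connected can still contribute to the group effect. This matches the abstract's point that the convolution step matters most for taxa with relatively low abundance.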