We propose a method for representing bipartite networks with graph embeddings, tailored to the challenges of studying ecological networks such as those linking plants and pollinators, where many covariates must be accounted for, in particular to control for sampling bias. We adapt the variational graph auto-encoder approach to the bipartite case, which lets us generate embeddings in a latent space where the two sets of nodes are positioned according to their probability of connection. To address sampling bias in ecology, we translate the fairness framework commonly considered in sociology. By incorporating the Hilbert-Schmidt independence criterion (HSIC) as an additional penalty term in the loss we optimize, we ensure that the structure of the latent space is independent of continuous variables related to the sampling process. Finally, we show how our approach can change our understanding of ecological networks when applied to the Spipoll data set, which comes from a citizen science monitoring program of plant-pollinator interactions to which many observers contribute, making it prone to sampling bias.
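To make the HSIC penalty concrete, here is a minimal sketch of the standard biased empirical HSIC estimator, computed between a batch of latent embeddings and a continuous sampling covariate. It is an illustration under stated assumptions, not the implementation used in this work: the Gaussian kernel, the fixed bandwidths, and the names `gaussian_gram`, `hsic`, `z`, and `s` are all hypothetical choices made for the example.

```python
import numpy as np


def gaussian_gram(x, sigma=1.0):
    """Gaussian (RBF) Gram matrix for the rows of x (kernel choice is an assumption)."""
    x = np.asarray(x, dtype=float)
    if x.ndim == 1:
        x = x[:, None]  # treat a 1-D covariate as n samples of dimension 1
    sq_norms = np.sum(x ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (x @ x.T)
    sq_dists = np.maximum(sq_dists, 0.0)  # guard against tiny negative round-off
    return np.exp(-sq_dists / (2.0 * sigma ** 2))


def hsic(z, s, sigma_z=1.0, sigma_s=1.0):
    """Biased empirical HSIC: trace(K H L H) / (n - 1)^2.

    z: (n, d) array of latent embeddings (hypothetical name).
    s: (n,) or (n, p) array of continuous sampling covariates (hypothetical name).
    """
    z = np.asarray(z, dtype=float)
    n = z.shape[0]
    K = gaussian_gram(z, sigma_z)
    L = gaussian_gram(s, sigma_s)
    H = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    return np.trace(K @ H @ L @ H) / (n - 1) ** 2


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    s = rng.normal(size=200)                       # stand-in sampling covariate
    z_dependent = np.column_stack([s, s ** 2])     # embeddings that leak s
    z_independent = rng.normal(size=(200, 2))      # embeddings unrelated to s
    print("dependent:  ", hsic(z_dependent, s))
    print("independent:", hsic(z_independent, s))
```

In a training loop, a term of this kind would typically be scaled by a weight and added to the auto-encoder loss, so that embeddings carrying information about the covariate are penalized. The weight and the bandwidths are tuning choices that this sketch does not address.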