In recent years, embedding learning on networks has shown tremendous results in link prediction tasks for complex systems, with a wide range of real-world applications. Learning a representation for each node in a knowledge graph allows us to capture topological and semantic information, which can then be used in downstream analyses. In the link prediction task, high-dimensional network information is encoded into low-dimensional vectors, which are then fed to a predictor to infer new connections between nodes in the network. As network complexity (that is, the number of connections and of interaction types) grows, embedding learning becomes increasingly challenging. This review covers published models for embedding learning on multiplex networks for link prediction. First, we propose refined taxonomies to classify and compare models according to the type of embeddings and the embedding techniques they use. Second, we review and address the problem of reproducible and fair evaluation of embedding learning on multiplex networks for the link prediction task. Finally, we tackle evaluation on directed multiplex networks by proposing a novel and fair testing procedure. This review constitutes a crucial step towards the development of more performant and tractable embedding learning approaches for multiplex networks and their fair evaluation for the link prediction task. We also suggest guidelines for evaluating models, and provide an informed perspective on the challenges of downstream analyses on multiplex networks and the tools currently available to address them.