Interactions and relations between objects may be pairwise or higher-order in nature, and so network-valued data are ubiquitous in the real world. The "space of networks", however, has a complex structure that cannot be adequately described using conventional statistical tools. We introduce a measure-theoretic formalism for modeling generalized network structures such as graphs, hypergraphs, or graphs whose nodes come with a partition into categorical classes. We then propose a metric on the space of such generalized networks that extends the Gromov-Wasserstein distance between graphs and the co-optimal transport distance between hypergraphs. We characterize the geometry of the resulting metric space, thereby providing a unified theoretical treatment of generalized networks that encompasses both pairwise and higher-order relations. In particular, we show that the space of generalized networks equipped with our metric is an Alexandrov space of non-negative curvature, and we leverage this structure to define gradients for certain functionals commonly arising in geometric data analysis tasks. We extend our analysis to the setting where vertices carry additional label information, and derive efficient computational schemes for use in practice. Equipped with these theoretical and computational tools, we demonstrate the utility of our framework in a suite of applications, including hypergraph alignment, clustering and dictionary learning from ensemble data, multi-omics alignment, and multiscale network alignment.
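For orientation, the two distances that the proposed metric extends can be written as follows. This is a sketch in the standard formulations from the optimal transport literature, not in this paper's own notation: the symbols $\omega$, $\mu$, $\nu$, $\pi$, $\xi$ and the exponent $p$ are assumptions introduced here, and normalization constants (such as a leading factor of $1/2$) vary between references.

```latex
% Gromov-Wasserstein distance between measure networks
% (X, \omega_X, \mu_X) and (Y, \omega_Y, \mu_Y), where \omega encodes
% pairwise relations and \Pi(\mu_X, \mu_Y) is the set of couplings.
d_{\mathrm{GW},p}(X, Y)
  = \inf_{\pi \in \Pi(\mu_X, \mu_Y)}
    \left(
      \int\!\!\int
        \bigl| \omega_X(x, x') - \omega_Y(y, y') \bigr|^{p}
      \, d\pi(x, y) \, d\pi(x', y')
    \right)^{1/p}

% Co-optimal transport distance between measure hypernetworks
% H = (X, \mu, Y, \nu, \omega) and H' = (X', \mu', Y', \nu', \omega'),
% where \omega : X \times Y \to \mathbb{R} encodes node-hyperedge incidence.
d_{\mathrm{COOT},p}(H, H')
  = \inf_{\substack{\pi \in \Pi(\mu, \mu') \\ \xi \in \Pi(\nu, \nu')}}
    \left(
      \int\!\!\int
        \bigl| \omega(x, y) - \omega'(x', y') \bigr|^{p}
      \, d\pi(x, x') \, d\xi(y, y')
    \right)^{1/p}
```

The Gromov-Wasserstein problem optimizes a single coupling of nodes that appears twice in the integral, whereas co-optimal transport optimizes two separate couplings, one for nodes and one for hyperedges.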