Hypergraphs are important for processing data with higher-order relationships, i.e., relationships involving more than two entities. When an explicit hypergraph is not readily available, it is desirable to infer a meaningful hypergraph structure from the node features in order to capture the intrinsic relations within the data. However, existing methods either adopt simple pre-defined rules that fail to precisely capture the distribution of the potential hypergraph structure, or learn a mapping between hypergraph structures and node features, which requires a large amount of labelled data (i.e., pre-existing hypergraph structures) for training. Both limitations restrict their applicability in practical scenarios. To fill this gap, we propose a novel smoothness prior that enables us to design a method for inferring the probability of each potential hyperedge without labelled data as supervision. The proposed prior states that the features of the nodes in a hyperedge are highly correlated via the features of the hyperedge that contains them. We use this prior to derive the relation between the hypergraph structure and the node features via probabilistic modelling. This allows us to develop an unsupervised inference method that estimates the probability of each potential hyperedge by solving an optimisation problem with an analytical solution. Experiments on both synthetic and real-world data demonstrate that our method can learn meaningful hypergraph structures from data more efficiently than existing hypergraph structure inference methods.