Hypergraphs are important for processing data with higher-order relationships involving more than two entities. In scenarios where explicit hypergraphs are not readily available, it is desirable to infer a meaningful hypergraph structure from the node features to capture the intrinsic relations within the data. However, existing methods either adopt simple pre-defined rules that fail to precisely capture the distribution of the potential hypergraph structure, or learn a mapping between hypergraph structures and node features but require a large amount of labelled data, i.e., pre-existing hypergraph structures, for training. Both limitations restrict their applicability in practical scenarios. To fill this gap, we propose a novel smoothness prior that enables us to design a method to infer the probability of each potential hyperedge without labelled data as supervision. The proposed prior states that the features of nodes in a hyperedge are highly correlated through the features of the hyperedge containing them. We use this prior to derive the relation between the hypergraph structure and the node features via probabilistic modelling. This allows us to develop an unsupervised inference method that estimates the probability of each potential hyperedge by solving an optimisation problem with an analytical solution. Experiments on both synthetic and real-world data demonstrate that our method can learn meaningful hypergraph structures from data more efficiently than existing hypergraph structure inference methods.
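To make the idea of a smoothness prior concrete, the following is a minimal illustrative sketch, not the method proposed in this paper. It assumes that a hyperedge's feature can be approximated by the centroid of its member nodes and that a candidate hyperedge is more plausible when its nodes lie close to that centroid. The function name `hyperedge_scores`, the `temperature` parameter, and the exponential scoring rule are all hypothetical choices made for illustration; the paper's analytical solution is not reproduced here.

```python
from itertools import combinations

import numpy as np


def hyperedge_scores(X, candidates, temperature=1.0):
    """Score candidate hyperedges by feature smoothness (illustrative only).

    X          : (n_nodes, n_features) array of node features.
    candidates : iterable of tuples of node indices.
    Returns a dict mapping each candidate to a score in (0, 1].
    """
    scores = {}
    for e in candidates:
        feats = X[list(e)]
        # Assumption: the centroid stands in for the "hyperedge feature".
        centroid = feats.mean(axis=0)
        # Mean squared distance of member nodes to the centroid.
        energy = np.mean(np.sum((feats - centroid) ** 2, axis=1))
        # Low energy (smooth features) maps to a score close to 1.
        scores[e] = float(np.exp(-energy / temperature))
    return scores


# Toy data: nodes 0-2 form one tight cluster, nodes 3-5 another.
X = np.array([[0.0, 0.1], [0.1, 0.0], [0.0, 0.0],
              [5.0, 5.1], [5.1, 5.0], [5.0, 5.0]])
candidates = list(combinations(range(6), 3))
scores = hyperedge_scores(X, candidates)
best = sorted(scores, key=scores.get, reverse=True)[:2]
```

On this toy data the two highest-scoring candidates are the two within-cluster triples, which matches the intuition that nodes in a hyperedge should have closely related features. Note that enumerating all size-3 candidates grows combinatorially with the number of nodes, so this sketch is only practical for very small examples.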