Heterogeneous graphs are ubiquitous data structures that inherently capture multi-type and multi-modal interactions between objects. In recent years, research on encoding heterogeneous graphs into latent representations has grown rapidly. However, the reverse process, namely how to construct heterogeneous graphs from underlying representations and distributions, has not been well explored due to several challenges in 1) modeling the local heterogeneous semantic distribution; 2) preserving the graph-structured distributions over the local semantics; and 3) characterizing the global heterogeneous graph distributions. To address these challenges, we propose a novel framework for heterogeneous graph generation (HGEN) that jointly captures the semantic, structural, and global distributions of heterogeneous graphs. Specifically, we propose a heterogeneous walk generator that hierarchically generates meta-paths and their path instances. In addition, we develop a novel heterogeneous graph assembler that samples and combines the generated meta-path instances (e.g., walks) into heterogeneous graphs in a stratified manner. We also provide a theoretical analysis of how the proposed generation process preserves heterogeneous graph patterns. Extensive experiments on multiple real-world and synthetic heterogeneous graph datasets demonstrate the effectiveness of HGEN in generating realistic heterogeneous graphs.
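To make the terms "meta-path", "path instance", and "assembling walks into a graph" concrete, the following is a minimal sketch and not the HGEN model itself. It assumes a hypothetical toy author/paper/venue graph, replaces the learned walk generator with uniform random sampling, and replaces the stratified assembler with a plain union of walk edges. All names (`NODE_TYPES`, `EDGES`, `sample_meta_path_instance`, `assemble_graph`) are illustrative assumptions.

```python
import random

# Hypothetical toy heterogeneous graph: node id -> node type.
NODE_TYPES = {
    "a1": "author", "a2": "author",
    "p1": "paper", "p2": "paper",
    "v1": "venue",
}

# Undirected edges between typed nodes (illustrative data only).
EDGES = [("a1", "p1"), ("a2", "p1"), ("a2", "p2"), ("p1", "v1"), ("p2", "v1")]


def build_adjacency(edges):
    """Build an undirected adjacency list from an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, []).append(u)
    return adj


def sample_meta_path_instance(adj, node_types, meta_path, rng):
    """Sample one walk whose node types follow meta_path.

    Uses uniform sampling as a stand-in for a learned generator.
    Returns None if no type-compatible neighbor exists at some step.
    """
    starts = sorted(n for n, t in node_types.items() if t == meta_path[0])
    if not starts:
        return None
    walk = [rng.choice(starts)]
    for next_type in meta_path[1:]:
        candidates = [n for n in adj.get(walk[-1], [])
                      if node_types[n] == next_type]
        if not candidates:
            return None
        walk.append(rng.choice(candidates))
    return walk


def assemble_graph(walks):
    """Combine walks into one edge set (a plain union, not stratified)."""
    edges = set()
    for walk in walks:
        for u, v in zip(walk, walk[1:]):
            edges.add(tuple(sorted((u, v))))
    return edges


if __name__ == "__main__":
    rng = random.Random(0)
    adj = build_adjacency(EDGES)
    meta_path = ["author", "paper", "venue"]
    walks = [sample_meta_path_instance(adj, NODE_TYPES, meta_path, rng)
             for _ in range(10)]
    walks = [w for w in walks if w is not None]
    print(assemble_graph(walks))
```

Here a meta-path is a sequence of node types, a path instance is a concrete walk matching that sequence, and the assembled graph contains only edges that appear in some sampled walk.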