Heterogeneous information networks (HINs), which consist of multiple types of nodes and edges, have recently shown excellent performance in graph mining. However, most existing heterogeneous graph neural networks (HGNNs) ignore three problems that limit their expressiveness: missing node attributes, inaccurate node attributes, and scarce node labels. In this paper, we propose GraMI, a generative self-supervised model that addresses these issues simultaneously. Specifically, GraMI first initializes all nodes in the graph with a low-dimensional representation matrix. Then, building on the variational graph autoencoder framework, GraMI learns both node-level and attribute-level embeddings in the encoder, which provide fine-grained semantic information for constructing node attributes. In the decoder, GraMI reconstructs both links and attributes. Instead of directly reconstructing raw features for attributed nodes, GraMI first generates the initial low-dimensional representation matrix for all nodes and then reconstructs the raw features of attributed nodes from it, so that accurate attributes are leveraged. In this way, GraMI can not only complete informative features for non-attributed nodes but also rectify inaccurate features for attributed nodes. Finally, we conduct extensive experiments to show the superiority of GraMI in tackling HINs with missing and inaccurate attributes.
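The data flow described above can be illustrated with a minimal forward-pass sketch. It is not the authors' implementation: the single-layer GCN encoder, the inner-product decoders, the linear map back to raw features, and every name (`forward`, `x0`, `attributed_idx`, `w_raw`, and so on) are assumptions made for illustration. For brevity it also treats the graph as having a single node type and uses random, untrained weights.

```python
import numpy as np


def normalize_adj(adj):
    """Symmetrically normalize A + I, as in a standard GCN layer."""
    a_hat = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def sample(mu, logvar, rng):
    """Reparameterization trick: z = mu + sigma * eps."""
    return mu + np.exp(0.5 * logvar) * rng.standard_normal(mu.shape)


def forward(adj, x0, attributed_idx, raw_dim, latent_dim=4, seed=0):
    """One untrained forward pass (hypothetical simplification).

    adj            : (n, n) symmetric adjacency matrix
    x0             : (n, d) initial low-dimensional representation of all nodes
    attributed_idx : indices of the nodes that have raw attributes
    raw_dim        : dimensionality of the raw attribute space
    """
    rng = np.random.default_rng(seed)
    n, d = x0.shape
    a_norm = normalize_adj(adj)

    # Encoder, node level: one graph convolution yields mean and log-variance.
    w_mu = rng.normal(scale=0.1, size=(d, latent_dim))
    w_logvar = rng.normal(scale=0.1, size=(d, latent_dim))
    z_node = sample(a_norm @ x0 @ w_mu, a_norm @ x0 @ w_logvar, rng)  # (n, k)

    # Encoder, attribute level: each column of x0 is one attribute,
    # described by its values across all nodes (assumed design).
    u_mu = rng.normal(scale=0.1, size=(n, latent_dim))
    u_logvar = rng.normal(scale=0.1, size=(n, latent_dim))
    z_attr = sample(x0.T @ u_mu, x0.T @ u_logvar, rng)  # (d, k)

    # Decoder, links: inner product between node embeddings.
    link_prob = sigmoid(z_node @ z_node.T)  # (n, n)

    # Decoder, attributes: regenerate the low-dimensional matrix for all nodes.
    x0_hat = z_node @ z_attr.T  # (n, d)

    # Raw features are reconstructed only for attributed nodes, from x0_hat.
    w_raw = rng.normal(scale=0.1, size=(d, raw_dim))
    raw_hat = x0_hat[attributed_idx] @ w_raw  # (len(attributed_idx), raw_dim)

    return link_prob, x0_hat, raw_hat
```

In a trained model, the three outputs would feed a link reconstruction loss, a loss on the regenerated low-dimensional matrix, and a raw-feature loss restricted to attributed nodes, together with the usual KL terms of a variational autoencoder. The exact objectives are not specified in this abstract.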