Although existing variational graph autoencoders (VGAEs) have been widely used for modeling and generating graph-structured data, most of them are still not flexible enough to approximate sparse and skewed latent node representations, especially those of document relational networks (DRNs) with discrete observations. For analyzing a collection of interconnected documents, relational topic models (RTMs), a typical branch of Bayesian models, have proven effective at describing both the link structures and the document contents of DRNs, which motivates us to combine RTMs with existing VGAEs to alleviate their potential issues when modeling the generation of DRNs. In this paper, moving beyond the sophisticated approximate assumptions of traditional RTMs, we develop graph Poisson factor analysis (GPFA), which provides analytic conditional posteriors to improve inference accuracy, and extend GPFA to a multi-stochastic-layer version, named the graph Poisson gamma belief network (GPGBN), to capture hierarchical document relationships at multiple semantic levels. Then, taking GPGBN as the decoder, we combine it with various Weibull-based graph inference networks, resulting in two variants of the Weibull graph auto-encoder (WGAE), equipped with model inference algorithms. Experimental results demonstrate that our models can extract high-quality hierarchical latent document representations and achieve promising performance on various graph analytic tasks.