Graph convolutional networks (GCNs) are \emph{discriminative models} that directly model the class posterior $p(y|\mathbf{x})$ for semi-supervised classification of graph data. Although GCNs are effective as a representation learning approach, the node representations they extract often lack information useful for clustering, because the classification and clustering objectives differ. In this work, we design normalizing flows that replace GCN layers, leading to a \emph{generative model} that models both the class-conditional likelihood $p(\mathbf{x}|y)$ and the class prior $p(y)$. The resulting neural network, GC-Flow, retains the graph convolution operations while being equipped with a Gaussian mixture representation space. It enjoys two benefits: it maintains the predictive power of GCN, and it produces well-separated clusters owing to the structure imposed on the representation space. We demonstrate these benefits on a variety of benchmark data sets. Moreover, we show that additional parameterization, such as that of the adjacency matrix used for graph convolutions, yields further improvement in clustering.
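
To make the link between the generative and discriminative views concrete, the following is a minimal sketch under stated assumptions, not the paper's exact formulation. The symbols $f$, $\mathbf{z}$, $\pi_k$, $\boldsymbol{\mu}_k$, and $\boldsymbol{\Sigma}_k$ are introduced here for illustration only, and the coupling between nodes induced by graph convolution is ignored. Assume an invertible flow $f$ maps a feature vector $\mathbf{x}$ to a representation $\mathbf{z} = f(\mathbf{x})$, that each class $k$ has prior $p(y=k) = \pi_k$, and that each class is Gaussian in the representation space. The change-of-variables formula then gives the class-conditional likelihood, and Bayes' rule recovers the class posterior:
\[
p(\mathbf{x} \mid y=k) = \mathcal{N}\bigl(f(\mathbf{x});\, \boldsymbol{\mu}_k, \boldsymbol{\Sigma}_k\bigr)\,\left|\det \frac{\partial f(\mathbf{x})}{\partial \mathbf{x}}\right|,
\qquad
p(y=k \mid \mathbf{x}) = \frac{\pi_k\, \mathcal{N}\bigl(f(\mathbf{x});\, \boldsymbol{\mu}_k, \boldsymbol{\Sigma}_k\bigr)}{\sum_{j} \pi_j\, \mathcal{N}\bigl(f(\mathbf{x});\, \boldsymbol{\mu}_j, \boldsymbol{\Sigma}_j\bigr)}.
\]
The Jacobian factor is the same for every class, so it cancels in the posterior. Under these assumptions, classification depends only on where $f(\mathbf{x})$ falls relative to the mixture components, which is consistent with a Gaussian mixture representation space supporting both prediction and clustering.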