Graph pooling is increasingly recognized as crucial for hierarchical graph representation learning in Graph Neural Networks (GNNs). Existing graph pooling methods commonly consist of two stages: selecting the top-ranked nodes and discarding the rest to construct a coarsened graph representation. However, this paper highlights two key issues with these methods: 1) Node selection frequently relies on additional Graph Convolutional Networks or Multilayer Perceptrons, without a thorough evaluation of each node's impact on the final graph representation and the subsequent prediction task. 2) Current graph pooling methods tend to discard the dropped nodes outright as noise, without accounting for the latent information they contain. To address the first issue, we introduce a novel Graph Explicit Pooling (GrePool) method, which selects nodes by explicitly leveraging the relationships between the nodes and the final representation vectors that are crucial for classification. To address the second issue, we propose an extended version, GrePool+, which applies a uniform loss to the discarded nodes; this addition is designed to augment the training process and improve classification accuracy. We conduct comprehensive experiments on 12 widely used datasets, including the Open Graph Benchmark datasets, to validate the effectiveness of the proposed method. The experimental results demonstrate that GrePool outperforms 14 baseline methods on most datasets. Likewise, GrePool+ improves on GrePool's performance without incurring additional computational cost.
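The two ideas in the abstract, explicit node selection and a uniform loss on the discarded nodes, can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so every detail below is an assumption made for illustration: the scaled dot-product scoring against a graph-level vector, the fixed pooling ratio, the KL-divergence form of the uniform loss, and all function names (`explicit_scores`, `select_top_k`, `uniform_loss`) are hypothetical and should not be read as the GrePool implementation.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def explicit_scores(node_feats, readout):
    """Score each node by its scaled dot product with a graph-level vector.

    Assumption: `readout` stands in for the final representation vector used
    for classification; the scoring rule itself is illustrative.
    """
    d = len(readout)
    logits = [
        sum(a * b for a, b in zip(h, readout)) / math.sqrt(d)
        for h in node_feats
    ]
    return softmax(logits)


def select_top_k(scores, ratio):
    """Split node indices into kept (top-ranked) and dropped (the rest)."""
    k = max(1, math.ceil(ratio * len(scores)))
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(order[:k]), sorted(order[k:])


def uniform_loss(class_logits, eps=1e-12):
    """Mean KL(uniform || p) over the discarded nodes' class predictions.

    Assumption: "uniform loss" is read here as pushing the predictions made
    from discarded nodes toward the uniform distribution over classes.
    """
    if not class_logits:
        return 0.0
    total = 0.0
    for logits in class_logits:
        p = softmax(logits)
        u = 1.0 / len(p)
        total += sum(u * math.log(u / max(pi, eps)) for pi in p)
    return total / len(class_logits)


# Toy graph with four nodes and two feature dimensions (made-up values).
node_feats = [[1.0, 0.5], [0.0, 0.2], [2.0, 2.0], [-1.0, -1.0]]
readout = [1.0, 1.0]

scores = explicit_scores(node_feats, readout)
kept, dropped = select_top_k(scores, ratio=0.5)
```

In this toy setting the nodes most aligned with the graph-level vector are kept and the remainder are dropped. The auxiliary term is zero when the predictions made from the dropped nodes are already uniform and grows as they become confident, which is one plausible way for discarded nodes to contribute a training signal without adding inference-time work.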