Self-supervision meets kernel graph neural models: From architecture to augmentations

Graph representation learning has become the de facto standard for handling graph-structured data, with message-passing graph neural networks (MPNNs) as the most prevalent algorithmic tool. Despite their popularity, MPNNs suffer from several drawbacks, such as limited transparency and expressivity. Recently, designing neural models on graphs using the theory of graph kernels has emerged as a more transparent and sometimes more expressive alternative to MPNNs, known as kernel graph neural networks (KGNNs). Research on KGNNs is still nascent, leaving open several challenges, from algorithmic design to adaptation to other learning paradigms such as self-supervised learning. In this paper, we improve the design and learning of KGNNs. First, we extend the algorithmic formulation of KGNNs by allowing a more flexible graph-level similarity definition that encompasses earlier proposals such as the random walk graph kernel, and by providing a smoother optimization objective that alleviates the need for combinatorial learning procedures. Second, we enhance KGNNs through the lens of self-supervision by developing a novel structure-preserving graph data augmentation method called latent graph augmentation (LGA). Finally, we perform extensive empirical evaluations to demonstrate the efficacy of the proposed mechanisms. Experimental results on benchmark datasets suggest that our model achieves performance comparable to, and sometimes better than, state-of-the-art graph representation learning frameworks, with or without self-supervision, on graph classification tasks. Comparisons against previously established graph data augmentation methods verify that the proposed LGA scheme better captures the semantics of graph-level invariance.
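
For readers unfamiliar with the random walk graph kernel mentioned above, the following is a minimal sketch of the classical unlabeled version, which counts walks shared by two graphs through their direct product graph. It is not the model proposed in this paper; the function name and the `max_steps` and `decay` parameters are illustrative assumptions.

```python
import numpy as np


def random_walk_kernel(adj_a, adj_b, max_steps=3, decay=0.1):
    """Classical truncated random walk kernel for two unlabeled graphs.

    Hypothetical helper for illustration only. It sums, over walk lengths
    0..max_steps, the decay-weighted number of walks in the direct product
    graph, whose adjacency matrix is the Kronecker product of the inputs.
    """
    product_adj = np.kron(adj_a, adj_b)
    # walks[i] holds the number of walks of the current length starting
    # at product node i; length-0 walks give one per node.
    walks = np.ones(product_adj.shape[0])
    total = 0.0
    for step in range(max_steps + 1):
        total += (decay ** step) * walks.sum()
        walks = product_adj @ walks
    return total


# Two small example graphs: a triangle and a 3-node path.
triangle = np.array([[0, 1, 1], [1, 0, 1], [1, 1, 0]], dtype=float)
path = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)

similarity = random_walk_kernel(triangle, path)
```

The similarity here is fixed by the graph structure alone. KGNNs, as described in the abstract, build learnable models on top of kernel-style similarities of this kind.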