Contrastive learning methods are effective at representation learning and have therefore been widely applied to graph data. Contrastive views of a graph are typically built by random perturbation, which can accidentally break important graph structures and lead to suboptimal performance. In addition, graph data is usually highly abstract, so it is hard to extract intuitive meaning from it and to design more informed augmentation schemes. Effective representations should preserve the key characteristics of the data and discard superfluous information. In this paper, we propose ENGAGE (ExplaNation Guided data AuGmEntation), in which explanations guide the contrastive augmentation process so that the key parts of a graph are preserved while exploring the removal of superfluous information. Specifically, we design an efficient unsupervised explanation method, called the smoothed activation map, that serves as an indicator of node importance in representation learning. We then design two graph data augmentation schemes that perturb structural and feature information, respectively. We also justify the proposed method within the framework of information theory. Experiments on both graph-level and node-level tasks, across various model architectures and real-world graphs, demonstrate the effectiveness and flexibility of ENGAGE. The code for ENGAGE is available at https://github.com/sycny/ENGAGE.
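The idea of letting node importance steer structural augmentation can be illustrated with a small sketch. The abstract does not specify how the smoothed activation map is computed or how the two augmentation schemes use it, so the code below is not the ENGAGE implementation: the function name `importance_guided_edge_drop`, the choice of scoring an edge by the mean importance of its endpoints, and the rescaling to a target `drop_rate` are all assumptions made for illustration. The importance scores are taken as given inputs.

```python
import random


def importance_guided_edge_drop(edges, node_importance, drop_rate=0.3, seed=0):
    """Drop edges with a probability that decreases with endpoint importance.

    edges: list of (u, v) node-index pairs.
    node_importance: list of per-node scores (hypothetical; assumed to come
        from some explanation method).
    Returns (kept_edges, drop_probabilities).
    """
    if not edges:
        return [], []
    # Assumption: an edge is as important as the mean of its two endpoints.
    edge_score = [(node_importance[u] + node_importance[v]) / 2.0 for u, v in edges]
    lo, hi = min(edge_score), max(edge_score)
    span = hi - lo
    # Min-max normalise; if all scores are equal, treat edges as equally important.
    norm = [(s - lo) / span if span > 0 else 0.5 for s in edge_score]
    unimportance = [1.0 - n for n in norm]
    mean_u = sum(unimportance) / len(unimportance)
    # Rescale so the mean drop probability equals drop_rate before clipping.
    drop_prob = [min(1.0, max(0.0, drop_rate * u / mean_u)) for u in unimportance]
    rng = random.Random(seed)
    # rng.random() lies in [0, 1), so an edge with probability 0 is always kept.
    kept = [e for e, p in zip(edges, drop_prob) if rng.random() >= p]
    return kept, drop_prob


# Toy 4-cycle: nodes 0 and 1 are "important", nodes 2 and 3 are not.
toy_edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
toy_importance = [0.9, 0.8, 0.1, 0.2]
kept_edges, probs = importance_guided_edge_drop(toy_edges, toy_importance)
```

In this toy graph the edge between the two highest-scoring nodes receives drop probability zero and is always retained, while the edge between the two lowest-scoring nodes is the most likely to be removed. A uniform random perturbation would treat all four edges alike.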