Graphs are ubiquitous in real-world scenarios and give rise to a diverse range of tasks, from node-, edge-, and graph-level tasks to transfer learning. However, designing specific tasks for each type of graph data is often costly and generalizes poorly. Recent work under the "Pre-training + Fine-tuning" and "Pre-training + Prompt" paradigms aims to design a unified framework that generalizes across multiple graph tasks. Among these approaches, graph autoencoders (GAEs), which are generative self-supervised models, have shown potential for addressing various graph tasks. Nevertheless, these methods typically employ multi-stage training and require adaptive designs, which on the one hand makes them difficult to apply seamlessly to diverse graph tasks and on the other hand overlooks the negative impact of discrepancies in task objectives between stages. To address these challenges, we propose GA^2E, a unified adversarially masked autoencoder. Specifically, GA^2E uses the subgraph as the meta-structure, which remains consistent across all graph tasks (node-, edge-, and graph-level tasks as well as transfer learning) and all stages (both training and inference). Further, GA^2E operates in a \textbf{"Generate then Discriminate"} manner. It uses the masked GAE to reconstruct the input subgraph and treats it as a generator, compelling the reconstructed subgraph to resemble the input subgraph. In addition, GA^2E introduces an auxiliary discriminator that distinguishes the reconstructed (generated) subgraph from the input subgraph, thereby ensuring the robustness of the graph representation through adversarial training. We validate GA^2E's capabilities through extensive experiments on 21 datasets across four types of graph tasks.
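The "Generate then Discriminate" flow described in the abstract can be illustrated with a minimal sketch. It is not GA^2E's implementation: the abstract does not specify the encoder, decoder, discriminator, or loss weights, so the neighbour-mean generator, the logistic discriminator, and all function names and parameters below (`mask_features`, `generate`, `discriminate`, `losses`, `mask_ratio`) are hypothetical stand-ins chosen only to show how a reconstruction term and an adversarial term fit together on a single subgraph.

```python
import math
import random


def mask_features(features, mask_ratio, rng):
    """Zero out the feature vectors of a random subset of nodes."""
    n = len(features)
    k = max(1, int(mask_ratio * n))
    masked = set(rng.sample(range(n), k))
    corrupted = [[0.0] * len(f) if i in masked else list(f)
                 for i, f in enumerate(features)]
    return corrupted, masked


def generate(corrupted, adjacency, masked):
    """Stand-in generator (assumption): rebuild each masked node as the
    mean of its visible neighbours. A real masked GAE would use a GNN."""
    dim = len(corrupted[0])
    recon = [list(f) for f in corrupted]
    for i in masked:
        visible = [j for j in adjacency[i] if j not in masked]
        if visible:
            recon[i] = [sum(corrupted[j][d] for j in visible) / len(visible)
                        for d in range(dim)]
    return recon


def discriminate(features, weights, bias):
    """Stand-in discriminator (assumption): logistic score of the
    mean-pooled subgraph features, read as P(subgraph is real)."""
    dim = len(weights)
    pooled = [sum(f[d] for f in features) / len(features) for d in range(dim)]
    logit = sum(w * p for w, p in zip(weights, pooled)) + bias
    return 1.0 / (1.0 + math.exp(-logit))


def losses(features, adjacency, weights, bias, mask_ratio=0.5, seed=0):
    """Compute one forward pass of the three loss terms (no training step)."""
    rng = random.Random(seed)
    corrupted, masked = mask_features(features, mask_ratio, rng)
    recon = generate(corrupted, adjacency, masked)
    dim = len(features[0])
    # Reconstruction term: mean squared error over the masked nodes only.
    rec = sum((recon[i][d] - features[i][d]) ** 2
              for i in masked for d in range(dim)) / (len(masked) * dim)
    p_real = discriminate(features, weights, bias)
    p_fake = discriminate(recon, weights, bias)
    eps = 1e-12
    # Discriminator wants p_real -> 1 and p_fake -> 0.
    d_loss = -(math.log(p_real + eps) + math.log(1.0 - p_fake + eps))
    # Generator's adversarial term wants p_fake -> 1.
    g_adv = -math.log(p_fake + eps)
    return {"reconstruction": rec, "discriminator": d_loss, "generator_adv": g_adv}


# Toy 4-node path subgraph with 2-dimensional node features.
features = [[1.0, 0.0], [0.8, 0.2], [0.2, 0.8], [0.0, 1.0]]
adjacency = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
out = losses(features, adjacency, weights=[0.5, -0.5], bias=0.0)
print(out)
```

In an actual training loop the generator would minimize the reconstruction term together with its adversarial term, while the discriminator would minimize its own loss in alternation; how GA^2E balances and schedules these terms is not stated in the abstract and is left open here.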