Unified Multi-Domain Graph Pre-training for Homogeneous and Heterogeneous Graphs via Domain-Specific Expert Encoding

Graph pre-training has achieved remarkable success in recent years, delivering transferable representations for downstream adaptation. However, most existing methods are designed for either homogeneous or heterogeneous graphs, which hinders unified modeling across graph types. This separation is at odds with real-world applications, where homogeneous and heterogeneous graphs routinely coexist and distribution shifts between upstream pre-training and downstream deployment are common. In this paper, we empirically demonstrate that pre-training on a balanced mixture of homogeneous and heterogeneous graphs benefits downstream tasks, and we propose a unified multi-domain \textbf{G}raph \textbf{P}re-training method across \textbf{H}omogeneous and \textbf{H}eterogeneous graphs ($\mathbf{GPH^{2}}$). To address the lack of a unified encoder for homogeneous and heterogeneous graphs, we propose a Unified Multi-View Graph Construction that encodes both simultaneously without explicit graph-type-specific designs. To cope with the larger cross-domain distribution discrepancies that arise from mixing graph types, we introduce domain-specific expert encoding: each expert is independently pre-trained on a single graph to capture domain-specific knowledge, which shields the pre-training encoder from the adverse effects of cross-domain discrepancies. For downstream tasks, we further design a Task-oriented Expert Fusion Strategy that adaptively integrates multiple experts based on their discriminative strengths. Extensive experiments on mixed graphs demonstrate that $\text{GPH}^{2}$ enables stable transfer across graph types and domains, significantly outperforming existing graph pre-training methods.
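
To make the idea of fusing experts "based on their discriminative strengths" more concrete, the following is a minimal sketch and not the method of the paper. The abstract does not specify how discriminative strength is measured or how experts are combined, so everything here is an assumption made for illustration: the helper names `fisher_score` and `fuse_experts` are hypothetical, the score is a simple between-class over within-class scatter ratio computed on labeled downstream nodes, the weights come from a softmax over those scores, and all experts are assumed to emit embeddings of the same dimension.

```python
import numpy as np


def fisher_score(emb, labels):
    """Hypothetical proxy for discriminative strength:
    between-class scatter divided by within-class scatter (higher = more separable)."""
    overall_mean = emb.mean(axis=0)
    between, within = 0.0, 0.0
    for c in np.unique(labels):
        cls = emb[labels == c]
        cls_mean = cls.mean(axis=0)
        between += len(cls) * np.sum((cls_mean - overall_mean) ** 2)
        within += np.sum((cls - cls_mean) ** 2)
    return between / (within + 1e-12)


def fuse_experts(expert_embs, labels, temperature=1.0):
    """Assumed fusion rule: softmax over per-expert scores, then a weighted sum.

    expert_embs: list of (num_nodes, dim) arrays, one per pre-trained expert.
    labels:      (num_nodes,) array of downstream class labels.
    """
    scores = np.array([fisher_score(e, labels) for e in expert_embs])
    logits = scores / temperature
    logits = logits - logits.max()  # subtract the max for numerical stability
    weights = np.exp(logits) / np.exp(logits).sum()
    fused = sum(w * e for w, e in zip(weights, expert_embs))
    return fused, weights


# Toy data standing in for the outputs of two experts on the same downstream nodes.
rng = np.random.default_rng(0)
labels = np.array([0] * 10 + [1] * 10)
separable = rng.normal(size=(20, 4)) + 3.0 * labels[:, None]  # classes shifted apart
uninformative = rng.normal(size=(20, 4))                      # no class structure

fused, weights = fuse_experts([separable, uninformative], labels)
print(weights)
```

In this toy setting the expert whose embeddings separate the two classes receives the larger weight. A learned gating network or attention over experts would be an equally plausible reading of the abstract; the fixed scoring rule is used here only because it keeps the sketch self-contained.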
