Transformers serve as the backbone architectures of Foundation Models, where domain-specific tokenizers allow them to adapt to various domains. Graph Transformers (GTs) have recently emerged as leading models in geometric deep learning, outperforming Graph Neural Networks (GNNs) on a variety of graph learning tasks. However, the development of tokenizers for graphs has lagged behind that for other modalities. To address this, we introduce GQT (\textbf{G}raph \textbf{Q}uantized \textbf{T}okenizer), which decouples tokenizer training from Transformer training by leveraging multi-task graph self-supervised learning, yielding robust and generalizable graph tokens. Furthermore, GQT utilizes Residual Vector Quantization (RVQ) to learn hierarchical discrete tokens, resulting in significantly reduced memory requirements and improved generalization. By combining GQT with token modulation, a Transformer encoder achieves state-of-the-art performance on 20 out of 22 benchmarks, including large-scale homophilic and heterophilic datasets.
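To make the quantization step concrete, below is a minimal PyTorch sketch of Residual Vector Quantization, where each level quantizes the residual left by the previous level so that the stacked code indices form hierarchical discrete tokens. This is an illustrative sketch only: the class name, hyperparameters (\texttt{num\_levels}, \texttt{codebook\_size}, \texttt{dim}), and random codebook initialization are assumptions, not the paper's implementation.

\begin{verbatim}
import torch

class ResidualVQ(torch.nn.Module):
    # Illustrative RVQ sketch; hyperparameters are assumed, not from the paper.
    def __init__(self, num_levels=3, codebook_size=256, dim=64):
        super().__init__()
        # One codebook per quantization level; each row is a code vector.
        self.codebooks = torch.nn.Parameter(
            torch.randn(num_levels, codebook_size, dim))

    def forward(self, x):
        # x: (N, dim) continuous node embeddings to be tokenized.
        residual = x
        quantized = torch.zeros_like(x)
        indices = []
        for codebook in self.codebooks:
            # Nearest code vector (Euclidean) for the current residual.
            idx = torch.cdist(residual, codebook).argmin(dim=-1)
            selected = codebook[idx]
            quantized = quantized + selected
            # The next level quantizes what this level missed.
            residual = residual - selected
            indices.append(idx)
        # (N, num_levels) integer ids: the hierarchical discrete tokens.
        return quantized, torch.stack(indices, dim=-1)

rvq = ResidualVQ()
q, tokens = rvq(torch.randn(5, 64))   # e.g. 5 node embeddings
print(tokens.shape)                    # torch.Size([5, 3])
\end{verbatim}

Under this scheme, only a few integer token ids need to be stored per node rather than a full continuous embedding, which is consistent with the reduced memory requirements claimed in the abstract.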