Node classification is a fundamental problem in information retrieval with many real-world applications, such as community detection in social networks, grouping of articles published online, and product categorization in e-commerce. Zero-shot node classification in text-attributed graphs (TAGs) is particularly challenging because no labeled data is available. In this paper, we propose Zero-shot Prompt Tuning (ZPT), a novel framework that addresses this problem with a Universal Bimodal Conditional Generator (UBCG). We first pre-train a graph-language model to capture both the graph structure and the textual description associated with each node. We then train a conditional generative model to learn the joint distribution of nodes across the graph and text modalities, which lets us generate synthetic samples for each class from the class name alone. The resulting synthetic node and text embeddings are used for continuous prompt tuning, enabling node classification in a zero-shot setting. Extensive experiments on multiple benchmark datasets show that our framework outperforms existing state-of-the-art baselines, and ablation studies validate the contribution of the bimodal generator. The code is available at: https://github.com/Sethup123/ZPT.
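To make the final stage of the pipeline concrete, the following is a minimal sketch of continuous prompt tuning on synthetic embeddings. It is a toy illustration under stated assumptions, not the ZPT implementation: the class-name embeddings are random vectors standing in for the output of the pre-trained graph-language model, `generate_synthetic` is a hypothetical Gaussian sampler standing in for the UBCG, and every function name and hyperparameter here is an assumption made for illustration.

```python
import numpy as np


def generate_synthetic(class_emb, n_per_class, noise, rng):
    """Hypothetical stand-in for the conditional generator (NOT the UBCG):
    draws samples from a Gaussian centred on each class-name embedding."""
    num_classes, dim = class_emb.shape
    eps = rng.standard_normal((num_classes, n_per_class, dim))
    samples = class_emb[:, None, :] + noise * eps
    labels = np.repeat(np.arange(num_classes), n_per_class)
    return samples.reshape(-1, dim), labels


def softmax(logits):
    """Row-wise softmax with the usual max-subtraction for stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


def cross_entropy(x, y, weights):
    """Mean cross-entropy of samples x with labels y under class weights."""
    probs = softmax(x @ weights.T)
    return float(-np.log(probs[np.arange(len(y)), y] + 1e-12).mean())


def tune_prompts(class_emb, x, y, steps=200, lr=0.1):
    """Learn one continuous prompt vector per class by gradient descent.
    The class-name embeddings stay frozen; only the prompts are updated."""
    num_classes, dim = class_emb.shape
    prompts = np.zeros((num_classes, dim))
    onehot = np.eye(num_classes)[y]
    for _ in range(steps):
        weights = class_emb + prompts            # prompted class vectors
        probs = softmax(x @ weights.T)           # shape (N, C)
        grad = (probs - onehot).T @ x / len(x)   # gradient of mean CE
        prompts -= lr * grad
    return prompts


def classify(x, class_emb, prompts):
    """Assign each embedding to the class with the highest dot product."""
    return np.argmax(x @ (class_emb + prompts).T, axis=1)


# Usage on toy data: 3 classes, 16-dimensional embeddings (all assumed).
rng = np.random.default_rng(0)
class_emb = rng.standard_normal((3, 16))
x_syn, y_syn = generate_synthetic(class_emb, 50, 0.3, rng)
prompts = tune_prompts(class_emb, x_syn, y_syn)
predictions = classify(x_syn, class_emb, prompts)
```

The point of the sketch is the data flow: no labeled nodes are used, since the only supervision comes from samples generated per class name. In the framework described above, the synthetic samples would cover both node and text embeddings drawn from the learned joint distribution, rather than a single Gaussian cloud per class.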