Large-scale pre-trained models are known to be transferable and to generalize well to unseen datasets. Recently, multimodal pre-trained models such as CLIP have shown significant performance improvements across diverse experiments. However, when labeled data is limited, generalizing to a new dataset or domain remains challenging. Diverse efforts, such as prompt learning and adapters, have sought to improve generalization in few-shot learning. However, current few-shot adaptation methods are not interpretable, and they incur a high computational cost for adaptation. In this study, we propose a new method, robust prompt learning with knowledge graph (RPLKG). Based on the knowledge graph, we automatically design diverse, interpretable, and meaningful prompt sets. Our model obtains cached embeddings of the prompt sets after a single forward pass through a large pre-trained model. The model then optimizes the prompt selection process with Gumbel-Softmax. In this way, our model is trained with relatively little memory and training time. In addition, RPLKG automatically selects the optimal interpretable prompt depending on the dataset. In summary, RPLKG i) is interpretable, ii) requires few computational resources, and iii) makes it easy to incorporate prior human knowledge. To validate RPLKG, we provide comprehensive experimental results in few-shot learning, domain generalization, and new-class generalization settings. RPLKG shows a significant performance improvement over zero-shot learning and competitive performance against several prompt learning methods while using far fewer resources.
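The prompt selection step described above can be illustrated with a minimal, forward-only sketch in plain Python. This is an illustration under stated assumptions, not the RPLKG implementation: the function names (`gumbel_softmax`, `select_prompt`, `cosine`), the toy two-dimensional embeddings, and the temperature values are all hypothetical. A real system would presumably use an automatic differentiation framework so that the selection logits can be learned by gradient descent.

```python
import math
import random


def gumbel_softmax(logits, tau=1.0, rng=random):
    """Sample relaxed one-hot weights from logits (Gumbel-Softmax trick)."""
    noisy = []
    for logit in logits:
        # Clamp the uniform sample away from 0 and 1 so both logs are finite.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        gumbel = -math.log(-math.log(u))  # Gumbel(0, 1) noise
        noisy.append((logit + gumbel) / tau)
    # Numerically stable softmax over the perturbed, temperature-scaled logits.
    peak = max(noisy)
    exps = [math.exp(v - peak) for v in noisy]
    total = sum(exps)
    return [e / total for e in exps]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def select_prompt(cached_prompts, logits, tau=1.0, rng=random):
    """Mix cached prompt embeddings using Gumbel-Softmax selection weights.

    cached_prompts: list of embedding vectors, assumed to be precomputed once
    by a frozen text encoder (hypothetical setup for this sketch).
    logits: one selection score per prompt; these would be the trainable part.
    """
    weights = gumbel_softmax(logits, tau, rng)
    dim = len(cached_prompts[0])
    mixed = [
        sum(w * prompt[d] for w, prompt in zip(weights, cached_prompts))
        for d in range(dim)
    ]
    return mixed, weights


if __name__ == "__main__":
    random.seed(0)
    # Toy cached embeddings for two candidate prompts of one class.
    prompts = [[1.0, 0.0], [0.0, 1.0]]
    image_embedding = [0.9, 0.1]
    mixed, weights = select_prompt(prompts, logits=[2.0, 0.0], tau=0.5)
    print("selection weights:", weights)
    print("similarity to image:", cosine(mixed, image_embedding))
```

Because the prompt embeddings are computed once and reused, only the small vector of selection logits would need updating during training, which is consistent with the low memory and training time claimed above. Lowering `tau` pushes the weights toward a one-hot choice, so the selected prompt stays readable as a single piece of text.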