Prompt learning has demonstrated impressive efficacy in fine-tuning multimodal large models for a wide range of downstream tasks. Nonetheless, applying existing prompt learning methods to the diagnosis of neurological disorders still suffers from two issues: (i) existing methods typically treat all patches equally, even though only a small number of patches in neuroimaging are relevant to the disease, and (ii) they ignore the structural information inherent in the brain connection network, which is crucial for understanding and diagnosing neurological disorders. To tackle these issues, we introduce a novel prompt learning model that learns graph prompts during the fine-tuning of multimodal large models for diagnosing neurological disorders. Specifically, we first leverage GPT-4 to obtain relevant disease concepts and compute the semantic similarity between these concepts and all patches. Second, we reduce the weight of irrelevant patches according to the semantic similarity between each patch and the disease-related concepts. Third, we construct a graph among tokens based on these concepts and employ a graph convolutional network layer to extract the structural information of the graph, which is then used to prompt the pre-trained multimodal large models for diagnosing neurological disorders. Extensive experiments demonstrate that our method achieves superior performance in neurological disorder diagnosis compared with state-of-the-art methods, and its results have been validated by clinicians.
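The three steps described in the abstract (concept-to-patch similarity, down-weighting of irrelevant patches, and a graph convolutional layer over a token graph) can be illustrated with a minimal sketch. It is not the paper's implementation. The toy 2-D embeddings, the cosine similarity measure, the softmax with a `temperature` parameter, the similarity `threshold` used to draw edges, and all function names are assumptions made for illustration; the abstract does not specify any of them.

```python
import math


def cosine(u, v):
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu > 0 and nv > 0 else 0.0


def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def patch_weights(patches, concepts, temperature=0.1):
    """Assumed scheme: score each patch by its best match to any concept."""
    scores = [max(cosine(p, c) for c in concepts) / temperature for p in patches]
    return softmax(scores)


def build_adjacency(tokens, threshold=0.5):
    """Assumed scheme: connect tokens whose cosine similarity exceeds a threshold.

    Self-loops are always included, so every node has degree >= 1.
    """
    n = len(tokens)
    return [
        [1.0 if i == j or cosine(tokens[i], tokens[j]) > threshold else 0.0
         for j in range(n)]
        for i in range(n)
    ]


def gcn_layer(adj, feats, weight):
    """One GCN layer: ReLU(D^-1/2 A D^-1/2 X W), with self-loops already in A."""
    n = len(adj)
    deg = [sum(row) for row in adj]
    norm = [[adj[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]
    out = matmul(matmul(norm, feats), weight)
    return [[max(0.0, x) for x in row] for row in out]


# Toy data: one disease-concept embedding and three patch embeddings.
concepts = [[1.0, 0.0]]
patches = [[1.0, 0.1], [0.9, 0.0], [0.0, 1.0]]

# Steps 1 and 2: similarity to concepts, then down-weight irrelevant patches.
weights = patch_weights(patches, concepts)
weighted = [[w * x for x in p] for w, p in zip(weights, patches)]

# Step 3: token graph plus one GCN layer (identity weight matrix for clarity).
adjacency = build_adjacency(patches)
graph_prompts = gcn_layer(adjacency, weighted, [[1.0, 0.0], [0.0, 1.0]])
```

In this toy setup the third patch is nearly orthogonal to the concept embedding, so it receives a much smaller weight than the other two and stays disconnected from them in the graph. In a real pipeline the embeddings would come from the pre-trained multimodal model's encoders, and the GCN weight matrix would be learned during prompt tuning.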