Prompt learning has proven effective for adapting multimodal large models to a wide range of downstream tasks. Nonetheless, applying existing prompt learning methods to the diagnosis of neurological disorders still suffers from two issues: (i) existing methods typically treat all patches equally, even though only a small number of patches in neuroimaging are relevant to the disease, and (ii) they ignore the structural information inherent in the brain connection network, which is crucial for understanding and diagnosing neurological disorders. To tackle these issues, we introduce a novel prompt learning model that learns graph prompts during the fine-tuning of multimodal large models for diagnosing neurological disorders. Specifically, we first leverage GPT-4 to obtain relevant disease concepts and compute the semantic similarity between these concepts and all patches. Second, we reduce the weight of irrelevant patches according to the semantic similarity between each patch and the disease-related concepts. Moreover, we construct a graph among tokens based on these concepts and employ a graph convolutional network layer to extract the structural information of the graph, which is then used to prompt the pre-trained multimodal large models for diagnosing neurological disorders. Extensive experiments demonstrate that our method achieves superior performance in neurological disorder diagnosis compared with state-of-the-art methods, and that its results are validated by clinicians.
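The pipeline described in the abstract (concept-guided patch reweighting followed by a graph convolution over tokens) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the scoring rule, the edge rule, or the layer details, so everything below is an assumption made for illustration. In particular, the function names (`patch_weights`, `build_graph`, `gcn_layer`), the softmax over each patch's best concept similarity, the `temperature` and `threshold` values, the rule that links tokens with similar concept-similarity profiles, and the toy 2-D embeddings are all hypothetical. The graph layer uses the standard symmetric normalization with self-loops.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def patch_weights(patches, concepts, temperature=0.1):
    """Assumed scoring rule: softmax over each patch's best concept similarity."""
    scores = [max(cosine(p, c) for c in concepts) / temperature for p in patches]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def build_graph(tokens, concepts, threshold=0.9):
    """Assumed edge rule: link tokens whose concept-similarity profiles agree."""
    profiles = [[cosine(t, c) for c in concepts] for t in tokens]
    n = len(tokens)
    return [[1.0 if i != j and cosine(profiles[i], profiles[j]) >= threshold else 0.0
             for j in range(n)] for i in range(n)]

def gcn_layer(adj, feats, weight):
    """One graph convolution: ReLU(D^-1/2 (A + I) D^-1/2 H W)."""
    n, d_in, d_out = len(adj), len(weight), len(weight[0])
    # Add self-loops, then normalize symmetrically by node degree.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    norm = [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]
    # Aggregate neighbour features, then apply the linear map and ReLU.
    agg = [[sum(norm[i][k] * feats[k][d] for k in range(n)) for d in range(d_in)]
           for i in range(n)]
    return [[max(0.0, sum(agg[i][d] * weight[d][o] for d in range(d_in)))
             for o in range(d_out)] for i in range(n)]

# Toy data (hypothetical): three patch embeddings and two concept embeddings.
patches = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
concepts = [[1.0, 0.0], [0.8, 0.6]]

weights = patch_weights(patches, concepts)              # down-weights the third patch
weighted = [[w * x for x in p] for w, p in zip(weights, patches)]
adj = build_graph(patches, concepts)                    # links the first two patches
identity = [[1.0, 0.0], [0.0, 1.0]]                     # stand-in for learned weights
graph_prompts = gcn_layer(adj, weighted, identity)
```

In this toy setting the third patch, which is least aligned with either concept, receives the smallest weight, and only the first two patches are connected in the graph. In a real system the embeddings would come from the pre-trained multimodal model and the layer weights would be learned during fine-tuning.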