Large Language Models (LLMs) have rapidly evolved from text-based systems to multimodal platforms, significantly impacting various sectors including healthcare. This comprehensive review explores the progression of LLMs to Multimodal Large Language Models (MLLMs) and their growing influence in medical practice. We examine the current landscape of MLLMs in healthcare, analyzing their applications across clinical decision support, medical imaging, patient engagement, and research. The review highlights the unique capabilities of MLLMs in integrating diverse data types, such as text, images, and audio, to provide more comprehensive insights into patient health. We also address the challenges facing MLLM implementation, including data limitations, technical hurdles, and ethical considerations. By identifying key research gaps, this paper aims to guide future investigations in areas such as dataset development, modality alignment methods, and the establishment of ethical guidelines. As MLLMs continue to shape the future of healthcare, understanding their potential and limitations is crucial for their responsible and effective integration into medical practice.