Large Language Models (LLMs) have rapidly evolved from text-only systems into multimodal platforms, with significant impact on many sectors, including healthcare. This review traces the progression from LLMs to Multimodal Large Language Models (MLLMs) and their growing influence on medical practice. We survey the current landscape of MLLMs in healthcare and analyze their applications in clinical decision support, medical imaging, patient engagement, and research. The review highlights the ability of MLLMs to integrate diverse data types, such as text, images, and audio, to give a more complete picture of patient health. We also address the challenges facing MLLM implementation, including data limitations, technical hurdles, and ethical considerations. By identifying key research gaps, this paper aims to guide future work on dataset development, modality alignment methods, and the establishment of ethical guidelines. As MLLMs continue to shape the future of healthcare, understanding their potential and limitations is crucial for their responsible and effective integration into medical practice.