Recent advancements in Artificial Intelligence (AI) have profoundly influenced medical fields by providing tools that reduce clinical workloads. However, most AI models are constrained to unimodal tasks, in stark contrast to the comprehensive approaches used by medical professionals. To address this, we present RO-LMM, a multi-purpose large multimodal model (LMM) tailored for the field of radiation oncology. The model covers a series of tasks within the clinical workflow: clinical report summarization, radiation treatment plan suggestion, and plan-guided target volume segmentation. In particular, to perform these consecutive clinical tasks, we further present a novel Consistency Embedding Fine-Tuning (CEFTune) technique, which boosts the LMM's robustness to noisy inputs while preserving its capability of handling clean inputs, and we extend this concept to an LMM-driven segmentation framework, Consistency Embedding Segmentation (CESEG). Experimental results on multi-centre cohorts demonstrate RO-LMM's promising performance on multiple clinical tasks, along with its generalization capability.