General medical AI has advanced considerably, but existing models often lack the reasoning capabilities needed for complex medical decision-making. This paper presents GMAI-VL-R1, a multimodal medical reasoning model whose reasoning abilities are enhanced through reinforcement learning (RL). Through iterative training, GMAI-VL-R1 refines its decision-making, improving diagnostic accuracy and clinical support. We also develop a reasoning data synthesis method that generates step-by-step reasoning data via rejection sampling, which further improves the model's generalization. Experimental results show that, after RL training, GMAI-VL-R1 performs strongly on tasks such as medical image diagnosis and visual question answering. Supervised fine-tuning alone gives the model basic memorization, whereas RL proves crucial for true generalization. Our work establishes new evaluation benchmarks and paves the way for future advances in medical reasoning models. Code, data, and model will be released at \href{https://github.com/uni-medical/GMAI-VL-R1}{this link}.
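The abstract mentions synthesizing step-by-step reasoning data via rejection sampling but does not give the procedure. As a rough illustration only, the sketch below shows one common form of the idea: draw several candidate reasoning traces per question and keep those whose final answer matches the reference. Everything here is an assumption rather than the paper's method: the names `rejection_sample` and `make_toy_generator`, the exact-match acceptance rule, and the toy text-only generator that stands in for a real multimodal model.

```python
import random
from typing import Callable, List, Tuple

# Hypothetical representation: a candidate is (reasoning_text, final_answer).
Candidate = Tuple[str, str]


def rejection_sample(
    question: str,
    reference_answer: str,
    generate: Callable[[str], Candidate],
    num_samples: int = 8,
) -> List[Candidate]:
    """Draw candidates and keep those whose final answer matches the reference.

    Assumption: acceptance is a case-insensitive exact match on the answer.
    A real pipeline might use a stricter or more tolerant verifier.
    """
    kept: List[Candidate] = []
    for _ in range(num_samples):
        reasoning, answer = generate(question)
        if answer.strip().lower() == reference_answer.strip().lower():
            kept.append((reasoning, answer))
    return kept


def make_toy_generator(seed: int = 0) -> Callable[[str], Candidate]:
    """Stand-in for a real model: picks one of three answers at random."""
    rng = random.Random(seed)
    options = ["pneumonia", "normal", "effusion"]

    def generate(question: str) -> Candidate:
        answer = rng.choice(options)
        reasoning = f"Step 1: inspect the image. Step 2: findings suggest {answer}."
        return reasoning, answer

    return generate


if __name__ == "__main__":
    generator = make_toy_generator(seed=0)
    accepted = rejection_sample(
        "What does the chest X-ray show?", "Pneumonia", generator, num_samples=20
    )
    print(f"kept {len(accepted)} of 20 sampled reasoning traces")
```

In this sketch the accepted traces would form the synthetic reasoning set; rejected traces are simply discarded.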