We introduce Baichuan-M3, a medical-enhanced large language model engineered to shift the paradigm from passive question-answering to active, clinical-grade decision support. Addressing the limitations of existing systems in open-ended consultations, Baichuan-M3 utilizes a specialized training pipeline to model the systematic workflow of a physician. Key capabilities include: (i) proactive information acquisition to resolve ambiguity; (ii) long-horizon reasoning that unifies scattered evidence into coherent diagnoses; and (iii) adaptive hallucination suppression to ensure factual reliability. Empirical evaluations demonstrate that Baichuan-M3 achieves state-of-the-art results on HealthBench and on the newly introduced HealthBench-Hallu and ScanBench, significantly outperforming GPT-5.2 in clinical inquiry, advisory and safety. The models are publicly available at https://huggingface.co/collections/baichuan-inc/baichuan-m3.