Although the global repository of medical knowledge is predominantly in English, local languages are crucial for delivering tailored healthcare services, particularly in areas with limited medical resources. To extend the reach of medical AI advancements to a broader population, we aim to develop medical LLMs across the six most widely spoken languages, encompassing a global population of 6.1 billion. This effort culminates in the creation of the ApolloCorpora multilingual medical dataset and the XMedBench benchmark. On the multilingual medical benchmark, the released Apollo models, at various relatively small sizes (i.e., 0.5B, 1.8B, 2B, 6B, and 7B), achieve the best performance among models of equivalent size. In particular, Apollo-7B is the state-of-the-art multilingual medical LLM among models up to 70B. Additionally, these lite models can be used to improve the multilingual medical capabilities of larger models without fine-tuning, in a proxy-tuning fashion. We will open-source the training corpora, code, model weights, and evaluation benchmark.
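The proxy-tuning idea mentioned in the abstract can be illustrated with a minimal sketch. It assumes the formulation commonly used in the proxy-tuning literature: at each decoding step, the logits of a large untuned model are shifted by the difference between a small tuned model and its untuned counterpart. The abstract does not specify how Apollo applies this, so the function names, the toy logits, and the assumption that all three models share one vocabulary are hypothetical and for illustration only.

```python
import math


def softmax(logits):
    """Convert a list of logits into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def proxy_tuned_logits(large_base, small_tuned, small_base):
    """Shift the large model's logits by the small models' tuned-minus-base offset.

    Assumption: all three lists are next-token logits over the same vocabulary.
    """
    if not (len(large_base) == len(small_tuned) == len(small_base)):
        raise ValueError("all logit vectors must cover the same vocabulary")
    return [l + (t - b) for l, t, b in zip(large_base, small_tuned, small_base)]


# Toy next-token logits over a 3-token vocabulary (made-up numbers).
large_base = [2.0, 1.0, 0.5]   # large general model, not fine-tuned
small_base = [1.0, 1.0, 1.0]   # small model before medical tuning
small_tuned = [0.5, 2.5, 1.0]  # small model after medical tuning

combined = proxy_tuned_logits(large_base, small_tuned, small_base)
probs = softmax(combined)
best_token = max(range(len(probs)), key=probs.__getitem__)
```

In this toy case the large model alone prefers token 0, while the shifted distribution prefers token 1, the token the small tuned model learned to favor. No weights of the large model are updated; only its output distribution is adjusted at decoding time.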