Although the global body of medical knowledge is predominantly in English, local languages are crucial for delivering tailored healthcare services, particularly in areas with limited medical resources. To extend the reach of medical AI advancements to a broader population, we aim to develop medical LLMs across the six most widely spoken languages, encompassing a global population of 6.1 billion. This effort culminates in the creation of the ApolloCorpora multilingual medical dataset and the XMedBench benchmark. On the multilingual medical benchmark, the released Apollo models, at various relatively small sizes (i.e., 0.5B, 1.8B, 2B, 6B, and 7B), achieve the best performance among models of equivalent size. In particular, Apollo-7B is the state-of-the-art multilingual medical LLM among models of up to 70B parameters. Additionally, these lite models can be used to improve the multilingual medical capabilities of larger models without fine-tuning, in a proxy-tuning fashion. We will open-source the training corpora, code, model weights, and evaluation benchmark.
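To make the proxy-tuning idea mentioned in the abstract concrete, the following is a minimal sketch of the general recipe as it is commonly described: at each decoding step, the logits of a large untuned model are shifted by the difference between the logits of a small tuned model and its untuned counterpart. The abstract does not specify the exact procedure used with Apollo, so the function names and the toy logits below are hypothetical and purely illustrative; the sketch also assumes that all three models share one vocabulary.

```python
import math
from typing import List


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax over one vocabulary-sized logit vector."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def proxy_tuned_logits(
    large_base: List[float],
    small_tuned: List[float],
    small_base: List[float],
) -> List[float]:
    """Shift the large model's logits by the small models' tuned-minus-base offset.

    Hypothetical helper for illustration only; assumes a shared vocabulary.
    """
    if not (len(large_base) == len(small_tuned) == len(small_base)):
        raise ValueError("all models must share one vocabulary")
    return [lb + (st - sb) for lb, st, sb in zip(large_base, small_tuned, small_base)]


# Toy next-token logits over a 4-token vocabulary (made-up numbers).
large_base = [2.0, 1.0, 0.5, 0.0]   # large general-purpose model
small_base = [1.0, 1.0, 0.0, 0.0]   # small model before medical tuning
small_tuned = [1.0, 3.0, 0.0, 0.0]  # small model after medical tuning

steered = proxy_tuned_logits(large_base, small_tuned, small_base)
probs = softmax(steered)
next_token = max(range(len(probs)), key=probs.__getitem__)
```

In this toy setting the large model alone would pick token 0, while the steered distribution prefers token 1, the token that the small tuned model learned to favor. No weights of the large model are updated; only its output distribution is adjusted at decoding time.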