BioMistral: A Collection of Open-Source Pretrained Large Language Models for Medical Domains

Large Language Models (LLMs) have demonstrated remarkable versatility in recent years, offering potential applications across specialized domains such as healthcare and medicine. Despite the availability of various open-source LLMs tailored for health contexts, adapting general-purpose LLMs to the medical domain presents significant challenges. In this paper, we introduce BioMistral, an open-source LLM tailored for the biomedical domain, utilizing Mistral as its foundation model and further pre-trained on PubMed Central. We conduct a comprehensive evaluation of BioMistral on a benchmark comprising 10 established medical question-answering (QA) tasks in English. We also explore lightweight models obtained through quantization and model merging approaches. Our results demonstrate BioMistral's superior performance compared to existing open-source medical models and its competitive edge against proprietary counterparts. Finally, to address the limited availability of data beyond English and to assess the multilingual generalization of medical LLMs, we automatically translated this benchmark into 7 other languages and evaluated the models on it. This marks the first large-scale multilingual evaluation of LLMs in the medical domain. Datasets, multilingual evaluation benchmarks, scripts, and all the models obtained during our experiments are freely released.

