The development of large language models for patients' clinical notes is often hindered by the limited accessibility and usability of these notes under strict privacy regulations. To address these challenges, we first create large-scale synthetic clinical notes from publicly available case reports extracted from biomedical literature. We then use these synthetic notes to train our specialized clinical large language model, Asclepius. Although Asclepius is trained on synthetic data, we assess its potential in real-world applications by evaluating it on real clinical notes. We benchmark Asclepius against several other large language models, including GPT-3.5-turbo and open-source alternatives. To further validate our approach, we also compare Asclepius with variants trained on real clinical notes. Our findings demonstrate that synthetic clinical notes can serve as viable substitutes for real ones when constructing high-performing clinical language models. This conclusion is supported by detailed evaluations conducted by both GPT-4 and medical professionals. All resources used in the development of Asclepius, including weights, code, and data, are publicly accessible for future research.