Large language models have exhibited exceptional performance on various Natural Language Processing (NLP) tasks, leveraging techniques such as pre-training and instruction fine-tuning. Despite these advances, their effectiveness in medical applications remains limited by challenges such as factual inaccuracies, weak reasoning abilities, and a lack of grounding in real-world experience. In this study, we present ClinicalGPT, a language model explicitly designed and optimized for clinical scenarios. By incorporating extensive and diverse real-world data, such as medical records, domain-specific knowledge, and multi-round dialogue consultations, into the training process, ClinicalGPT is better prepared to handle multiple clinical tasks. Furthermore, we introduce a comprehensive evaluation framework that includes medical knowledge question-answering, medical exams, patient consultations, and diagnostic analysis of medical records. Our results demonstrate that ClinicalGPT significantly outperforms other models in these tasks, highlighting the effectiveness of our approach in adapting large language models to the critical domain of healthcare.