We propose DISC-MedLLM, a comprehensive solution that leverages Large Language Models (LLMs) to provide accurate and truthful medical responses in end-to-end conversational healthcare services. To construct high-quality Supervised Fine-Tuning (SFT) datasets, we employ three strategies: utilizing medical knowledge graphs, reconstructing real-world dialogues, and incorporating human-guided preference rephrasing. DISC-MedLLM is trained on these datasets and surpasses existing medical LLMs in both single-turn and multi-turn consultation scenarios. Extensive experimental results demonstrate that the proposed model is effective in bridging the gap between general language models and real-world medical consultation. Additionally, we release the constructed dataset and model weights to support further research and development. Further details and resources can be found at https://github.com/FudanDISC/DISC-MedLLM