The development of large language models (LLMs) has opened unprecedented possibilities for artificial intelligence (AI)-based medical diagnosis. However, the prospects for applying LLMs in real diagnostic scenarios remain unclear because they are not adept at proactively collecting patient data. This study presents an LLM-based diagnostic system that enhances planning capabilities by emulating doctors. The system uses two external planners to handle planning tasks. The first planner employs a reinforcement learning approach to formulate disease screening questions and conduct initial diagnoses. The second planner uses LLMs to parse medical guidelines and conduct differential diagnoses. Using real patient electronic medical record data, we constructed simulated dialogues between virtual patients and doctors and evaluated the diagnostic abilities of our system. We demonstrate that our system significantly surpasses existing models, including GPT-4 Turbo, in both disease screening and differential diagnosis. This research represents a step towards integrating AI more seamlessly into clinical settings, potentially enhancing the accuracy and accessibility of medical diagnostics.