The development of large language models (LLMs) has brought unprecedented possibilities for artificial intelligence (AI)-based medical diagnosis. However, the prospects for applying LLMs in real diagnostic scenarios remain unclear because they are not adept at proactively collecting patient data. This study presents a novel approach in which AI systems emulate the two-phase process that physicians follow during medical consultations. Our methodology involves two specialized planners: the first employs a data-driven reinforcement learning approach to formulate disease screening questions; the second uses LLMs to parse medical guidelines and conduct differential diagnosis. Using real patient electronic medical record (EMR) data, we constructed simulated dialogues between virtual patients and doctors and evaluated the diagnostic abilities of our system. We demonstrate that our system surpasses existing models, including GPT-4 Turbo, in both disease screening and differential diagnosis. This research represents a step towards integrating AI more seamlessly into clinical settings, potentially improving the accuracy and accessibility of medical diagnostics.