MedDiT: A Knowledge-Controlled Diffusion Transformer Framework for Dynamic Medical Image Generation in Virtual Simulated Patient

Medical education relies heavily on Simulated Patients (SPs) to provide a safe environment in which students can practice clinical skills, including medical image analysis. However, the high cost of recruiting qualified SPs and the lack of diverse medical imaging datasets pose significant challenges. To address these issues, this paper introduces MedDiT, a novel knowledge-controlled conversational framework that dynamically generates plausible medical images aligned with simulated patient symptoms, enabling diverse diagnostic skill training. Specifically, MedDiT integrates various patient Knowledge Graphs (KGs), which describe the attributes and symptoms of patients, to dynamically prompt Large Language Models (LLMs), controlling the patient characteristics they portray and mitigating hallucination during medical conversation. Additionally, a well-tuned Diffusion Transformer (DiT) model is incorporated to generate medical images according to the patient attributes specified in the KG. We present the capabilities of MedDiT through a practical demonstration, showcasing its ability to role-play diverse simulated patient cases and generate the corresponding medical images. This can provide a rich and interactive learning experience for students, advancing medical education by offering an immersive simulation platform for future healthcare professionals. The work sheds light on the feasibility of incorporating advanced technologies such as LLMs, KGs, and DiTs in educational applications, highlighting their potential to address the challenges faced in simulated patient-based medical education.
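To make the knowledge-controlled idea concrete, the following is a minimal sketch of how patient facts stored as KG triples might be turned into (a) a system prompt that constrains an LLM to the patient's attributes and (b) a short text condition for an image generator. Everything here is an illustrative assumption: the `Triple` schema, the relation names (`modality`, `body_part`, `finding`), and the helper functions are hypothetical, and the abstract does not specify MedDiT's actual prompt format or how its DiT is conditioned.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Triple:
    """One (subject, relation, object) fact in a hypothetical patient KG."""
    subject: str
    relation: str
    obj: str


def patient_facts(kg, patient):
    """Return every triple whose subject is the given patient."""
    return [t for t in kg if t.subject == patient]


def build_system_prompt(kg, patient):
    """Compose an LLM system prompt restricted to the patient's KG facts."""
    lines = [f"- {t.relation}: {t.obj}" for t in patient_facts(kg, patient)]
    return (
        "You are a simulated patient. Answer only from the facts below; "
        "if asked about anything else, say you are not sure.\n"
        + "\n".join(lines)
    )


def build_image_condition(kg, patient,
                          imaging_relations=("modality", "body_part", "finding")):
    """Join the imaging-related attributes into one text condition.

    The relation names are assumptions made for this sketch only.
    """
    facts = {t.relation: t.obj for t in patient_facts(kg, patient)}
    return ", ".join(facts[r] for r in imaging_relations if r in facts)


# Toy KG with two patients (invented example data, not from the paper).
kg = [
    Triple("patient_01", "age", "58"),
    Triple("patient_01", "symptom", "persistent cough"),
    Triple("patient_01", "modality", "chest X-ray"),
    Triple("patient_01", "finding", "right lower lobe opacity"),
    Triple("patient_02", "symptom", "knee pain"),
]

prompt = build_system_prompt(kg, "patient_01")
condition = build_image_condition(kg, "patient_01")
```

In this sketch, `prompt` would be passed to the conversational model and `condition` to the image generator, so both the dialogue and the image derive from the same set of patient facts. Swapping in a different patient identifier changes both outputs together, which is the kind of consistency the KG is meant to enforce.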
