Standardized patients (SPs) are indispensable for clinical skills training but remain expensive and difficult to scale. Large language model (LLM)-based virtual standardized patients (VSPs) have been proposed as an alternative, but their behavior remains unstable and has not been rigorously compared with that of human SPs. We propose EasyMED, a multi-agent VSP framework that separates case-grounded information disclosure from response generation to support stable, inquiry-conditioned patient behavior. We also introduce SPBench, a human-grounded benchmark with eight expert-defined criteria for interaction-level evaluation. Experiments show that EasyMED matches human SP behavior more closely than existing VSPs, particularly in case consistency and controlled disclosure. A four-week controlled study further shows learning outcomes comparable to those of human SP training, with stronger early gains for novice learners and improved flexibility, psychological safety, and cost efficiency.
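To make the separation between information disclosure and response generation concrete, the following is a minimal illustrative sketch, not the EasyMED implementation. All names (`CaseRecord`, `DisclosureAgent`, `ResponseAgent`, `patient_turn`) are hypothetical, and the keyword matching and string joining are simple stand-ins for what would presumably be LLM-based agents. The sketch only shows the structural idea: one component decides which case facts an inquiry unlocks, and a separate component verbalizes only those facts.

```python
from dataclasses import dataclass, field


@dataclass
class CaseRecord:
    """Hypothetical case script: topic -> scripted patient fact."""
    facts: dict[str, str]


@dataclass
class DisclosureAgent:
    """Decides WHICH case facts a question unlocks.

    Keyword matching is an assumed stand-in for the real decision logic.
    """
    case: CaseRecord
    triggers: dict[str, list[str]]  # topic -> keywords that unlock it
    disclosed: set[str] = field(default_factory=set)

    def select(self, question: str) -> list[str]:
        q = question.lower()
        topics = [
            topic
            for topic, keywords in self.triggers.items()
            if any(kw in q for kw in keywords)
        ]
        self.disclosed.update(topics)  # track what has been revealed so far
        return topics


class ResponseAgent:
    """Decides HOW to phrase the reply; it sees only the facts it is handed."""

    def render(self, facts: list[str]) -> str:
        if not facts:
            return "I'm not sure what you mean, doctor."
        return " ".join(facts)


def patient_turn(question: str, disclosure: DisclosureAgent,
                 responder: ResponseAgent) -> str:
    """One dialogue turn: gate the facts first, then generate the reply."""
    topics = disclosure.select(question)
    facts = [disclosure.case.facts[t] for t in topics]
    return responder.render(facts)


if __name__ == "__main__":
    case = CaseRecord(facts={
        "chief_complaint": "I've had chest pain since this morning.",
        "smoking": "I smoke about a pack a day.",
    })
    disclosure = DisclosureAgent(case, triggers={
        "chief_complaint": ["brings you", "pain"],
        "smoking": ["smoke", "tobacco"],
    })
    responder = ResponseAgent()
    print(patient_turn("What brings you in today?", disclosure, responder))
    print(patient_turn("Do you smoke?", disclosure, responder))
```

Because the response component never sees the full case record in this sketch, it cannot volunteer facts the learner has not asked about, which is one plausible way to obtain inquiry-conditioned, controlled disclosure.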