Hospital administration departments handle a wide range of operational tasks and, in large hospitals, process over 10,000 requests per day, driving growing interest in LLM-based automation. However, prior work has focused primarily on patient--physician interactions or isolated administrative subtasks, failing to capture the complexity of real administrative workflows. To address this gap, we propose H-AdminSim, a comprehensive end-to-end simulation framework that combines realistic data generation with multi-agent-based simulation of hospital administrative workflows. The simulated administrative tasks are quantitatively evaluated using detailed rubrics, enabling systematic comparison of LLMs. Through FHIR integration, H-AdminSim provides a unified and interoperable environment for testing administrative workflows across heterogeneous hospital settings, serving as a standardized testbed for assessing the feasibility and performance of LLM-driven administrative automation.