Large Language Model (LLM)-based agents that plan, use tools, and act have begun to shape healthcare and medicine. Reported studies demonstrate competence on tasks ranging from EHR analysis and differential diagnosis to treatment planning and research workflows. Yet existing overviews are largely either broad surveys or narrow dives into a single capability (e.g., memory, planning, reasoning), leaving healthcare work without a common frame. We address this by reviewing 49 studies using a seven-dimensional taxonomy (Cognitive Capabilities, Knowledge Management, Interaction Patterns, Adaptation & Learning, Safety & Ethics, Framework Typology, and Core Tasks & Subtasks) comprising 29 operational sub-dimensions. Using explicit inclusion and exclusion criteria and a labeling rubric (Fully Implemented, Partially Implemented, Not Implemented), we map each study to the taxonomy and report quantitative summaries of capability prevalence and co-occurrence patterns. Our empirical analysis surfaces clear asymmetries. For instance, the External Knowledge Integration sub-dimension under Knowledge Management is commonly realized (~76% Fully Implemented), whereas the Event-Triggered Activation sub-dimension under Interaction Patterns is largely absent (~92% Not Implemented) and the Drift Detection & Mitigation sub-dimension under Adaptation & Learning is rare (~98% Not Implemented). Architecturally, the Multi-Agent Design sub-dimension under Framework Typology is the dominant pattern (~82% Fully Implemented), while orchestration layers remain mostly Partially Implemented. Across Core Tasks & Subtasks, information-centric capabilities such as Medical Question Answering & Decision Support and Benchmarking & Simulation lead, while action- and discovery-oriented areas such as Treatment Planning & Prescription still show substantial gaps (~59% Not Implemented).
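As an illustration of how prevalence and co-occurrence summaries of this kind could be tabulated from rubric labels, a minimal Python sketch follows. It is not the pipeline used in the review: the study identifiers, the toy label assignments, and the helper names `prevalence` and `co_occurrence` are all hypothetical, and only the rubric labels and sub-dimension names are taken from the abstract.

```python
from collections import Counter
from itertools import combinations

# Rubric labels as named in the abstract.
FULL = "Fully Implemented"
PARTIAL = "Partially Implemented"
NONE = "Not Implemented"

# Hypothetical toy data: study -> {sub-dimension: rubric label}.
# These values are illustrative only, not results from the reviewed studies.
labels = {
    "study_a": {"External Knowledge Integration": FULL,
                "Multi-Agent Design": FULL,
                "Drift Detection & Mitigation": NONE},
    "study_b": {"External Knowledge Integration": FULL,
                "Multi-Agent Design": PARTIAL,
                "Drift Detection & Mitigation": NONE},
    "study_c": {"External Knowledge Integration": PARTIAL,
                "Multi-Agent Design": FULL,
                "Drift Detection & Mitigation": NONE},
    "study_d": {"External Knowledge Integration": FULL,
                "Multi-Agent Design": FULL,
                "Drift Detection & Mitigation": NONE},
}


def prevalence(labels, sub_dimension):
    """Return the share of studies carrying each rubric label for one sub-dimension."""
    counts = Counter(study[sub_dimension] for study in labels.values())
    total = sum(counts.values())
    return {label: counts.get(label, 0) / total for label in (FULL, PARTIAL, NONE)}


def co_occurrence(labels, level=FULL):
    """Count the studies in which a pair of sub-dimensions both carry `level`."""
    pairs = Counter()
    for study in labels.values():
        # Sort names so each pair has one canonical ordering.
        present = sorted(name for name, lab in study.items() if lab == level)
        pairs.update(combinations(present, 2))
    return pairs


if __name__ == "__main__":
    print(prevalence(labels, "External Knowledge Integration"))
    print(co_occurrence(labels))
```

Under these assumptions, a figure such as "~76% Fully Implemented" corresponds to the `FULL` entry returned by `prevalence` for the relevant sub-dimension, and co-occurrence is counted per pair of sub-dimensions that share a label within the same study.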