Prior Authorization ensures that care is safe, appropriate, cost-effective, and medically justified according to evidence-based guidelines. However, the process often requires labor-intensive manual comparison of patient medical records against clinical guidelines, which is both repetitive and time-consuming. Recent developments in Large Language Models (LLMs) have shown potential for addressing complex medical NLP tasks with minimal supervision. This paper explores the application of a Multi-Agent System (MAS) that utilizes specialized LLM agents to automate the Prior Authorization task by breaking it down into simpler, more manageable sub-tasks. Our study systematically investigates the effects of various prompting strategies on these agents and benchmarks the performance of different LLMs. We demonstrate that GPT-4 achieves an accuracy of 86.2% in predicting checklist item-level judgments with evidence, and 95.6% in determining overall checklist judgment. Additionally, we explore how these agents can contribute to the explainability of the steps taken in the process, thereby enhancing trust and transparency in the system.