The integration of generative AI into Electronic Health Records (EHRs) is a pivotal advancement that addresses a critical gap in current information extraction techniques, in healthcare and beyond. This paper introduces GAMedX, a Named Entity Recognition (NER) approach that uses Large Language Models (LLMs) to efficiently extract entities from the medical narratives and unstructured text generated throughout the phases of a patient's hospital visit. The methodology takes a unified approach: it integrates open-source LLMs for NER, using chained prompts and Pydantic schemas to produce structured output and to handle the complexities of specialized medical jargon. On one of the evaluation datasets, GAMedX achieves a high ROUGE F1 score and an accuracy of 98\%. The approach improves entity extraction and offers a scalable, cost-effective solution for automated form filling from unstructured data. As a result, GAMedX streamlines the processing of unstructured narratives and sets a new standard for NER applications, contributing to theoretical and practical advances beyond medical technology.
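To make the combination of chained prompts and Pydantic schemas concrete, the following is a minimal sketch, not the GAMedX implementation. The `PatientVisit` schema, its field names, the prompt wording, and the `fake_llm` stand-in are all hypothetical; any callable that maps a prompt string to a response string could take the place of `fake_llm`.

```python
import json
from typing import Callable, List, Optional

from pydantic import BaseModel, ValidationError


class PatientVisit(BaseModel):
    """Hypothetical target form; the fields are illustrative assumptions."""
    patient_name: Optional[str] = None
    age: Optional[int] = None
    symptoms: List[str] = []
    medications: List[str] = []


# Keys requested from the model; kept in sync with PatientVisit by hand.
SCHEMA_KEYS = "patient_name, age, symptoms, medications"


def extract(note: str, llm: Callable[[str], str]) -> Optional[PatientVisit]:
    """Run two chained prompts, then validate the result against the schema."""
    # Step 1: ask the model to restate the clinically relevant facts.
    summary = llm(
        "List the patient details, symptoms, and medications "
        f"mentioned in this note:\n{note}"
    )
    # Step 2: feed the output of step 1 into a prompt that requests JSON.
    raw = llm(f"Return a JSON object with keys {SCHEMA_KEYS} for:\n{summary}")
    try:
        # Pydantic checks types, so malformed output is rejected here.
        return PatientVisit(**json.loads(raw))
    except (json.JSONDecodeError, TypeError, ValidationError):
        return None


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for a real model call, used only for the demo."""
    if prompt.startswith("Return a JSON object"):
        return (
            '{"patient_name": "Jane Doe", "age": 54, '
            '"symptoms": ["cough"], "medications": []}'
        )
    return "Jane Doe, 54 years old, presents with cough."


if __name__ == "__main__":
    print(extract("Jane Doe, 54, came in with a cough.", fake_llm))
```

Validating against a schema means that output the model formats incorrectly is rejected rather than passed silently into a downstream form. This sketch returns `None` in that case; retrying the prompt would be another reasonable choice.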