This paper introduces a system that integrates large language models (LLMs) into the clinical trial retrieval process, enhancing the effectiveness of matching patients with eligible trials while maintaining information privacy and allowing expert oversight. We evaluate six LLMs for query generation, focusing on open-source and relatively small models that require minimal computational resources. Our evaluation includes two closed-source and four open-source models, one specifically trained on medical-domain data and five general-purpose models. We compare the retrieval effectiveness of LLM-generated queries against queries created by medical experts and against state-of-the-art methods from the literature. Our findings indicate that the evaluated models achieve retrieval effectiveness on par with or better than expert-created queries, and the LLMs consistently outperform standard baselines and other approaches from the literature. The best-performing LLMs exhibit fast response times, ranging from 1.7 to 8 seconds, and generate a manageable number of query terms (15-63 on average), making them suitable for practical deployment. Overall, our findings suggest that leveraging small, open-source LLMs for clinical trial retrieval can balance performance, computational efficiency, and real-world applicability in medical settings.