This paper introduces a system that integrates large language models (LLMs) into the clinical trial retrieval process, enhancing the effectiveness of matching patients with eligible trials while maintaining information privacy and allowing expert oversight. We evaluate six LLMs for query generation, focusing on open-source and relatively small models that require minimal computational resources. Our evaluation includes two closed-source and four open-source models, one specifically trained on medical-domain data and five general-purpose models. We compare the retrieval effectiveness of LLM-generated queries against queries created by medical experts and against state-of-the-art methods from the literature. Our findings indicate that the evaluated models achieve retrieval effectiveness on par with or better than expert-created queries, and the LLMs consistently outperform standard baselines and other approaches from the literature. The best-performing LLMs exhibit fast response times, ranging from 1.7 to 8 seconds, and generate a manageable number of query terms (15-63 on average), making them suitable for practical deployment. Overall, our findings suggest that leveraging small, open-source LLMs for clinical trial retrieval can balance performance, computational efficiency, and real-world applicability in medical settings.