The application of Artificial Intelligence (AI) in healthcare has been revolutionary, especially with the recent advancements in transformer-based Large Language Models (LLMs). However, understanding unstructured electronic medical records remains a challenge, both because of the nature of the records (e.g., disorganization, inconsistency, and redundancy) and because LLMs are unable to derive reasoning paradigms that allow for a comprehensive understanding of medical variables. In this work, we examine the power of coupling symbolic reasoning with language modeling toward improved understanding of unstructured clinical texts. We show that this combination improves the extraction of several medical variables from unstructured records. In addition, we show that state-of-the-art non-commercial LLMs offer retrieval capabilities comparable to those of their commercial counterparts. Finally, we elaborate on the need for steering LLMs through symbolic reasoning, as the exclusive use of LLMs results in the lowest performance.