Extracting lung lesion information from clinical and medical imaging reports is crucial for both research on and clinical care of lung-related diseases. Large language models (LLMs) can interpret the unstructured text in these reports effectively, but they often hallucinate because they lack domain-specific knowledge, which reduces accuracy and poses challenges for use in clinical settings. To address this, we propose a novel framework that aligns generated internal knowledge with external knowledge through in-context learning (ICL). The framework employs a retriever to identify relevant units of internal or external knowledge and a grader to evaluate the truthfulness and helpfulness of the retrieved internal-knowledge rules, in order to align and update the knowledge bases. Experiments on expert-curated test datasets demonstrate that this ICL approach can increase the F1 score for key fields (lesion size, margin and solidity) by an average of 12.9% over existing ICL methods.
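The retriever-and-grader loop described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the authors' implementation: the `Rule` class, the function names, the token-overlap similarity used in place of a real retriever, the toy grader that scores an internal rule by its agreement with external rules, and the 0.3 threshold are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Rule:
    """One unit of knowledge (hypothetical structure)."""
    text: str
    source: str  # "internal" (model-generated) or "external" (curated)


def overlap(a: str, b: str) -> float:
    """Jaccard similarity of token sets; a stand-in for a real retriever score."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def retrieve(query: str, rules: list[Rule], k: int = 2) -> list[Rule]:
    """Return the k rules most similar to the query."""
    ranked = sorted(rules, key=lambda r: overlap(query, r.text), reverse=True)
    return ranked[:k]


def grade(rule: Rule, external: list[Rule]) -> float:
    """Toy grader: score an internal rule by its best agreement with external rules.

    Assumption: a real grader would judge truthfulness and helpfulness,
    for example with an LLM; this proxy only measures lexical agreement.
    """
    return max((overlap(rule.text, e.text) for e in external), default=0.0)


def align(internal: list[Rule], external: list[Rule], threshold: float = 0.3) -> list[Rule]:
    """Keep only internal rules whose grade reaches the (assumed) threshold."""
    return [r for r in internal if grade(r, external) >= threshold]


def build_prompt(report: str, rules: list[Rule]) -> str:
    """Assemble an in-context prompt from the retrieved rules and the report."""
    lines = ["Rules:"]
    lines += [f"- {r.text}" for r in rules]
    lines += ["Report:", report, "Extract lesion size, margin and solidity."]
    return "\n".join(lines)


external = [Rule("lesion size is reported in mm as the longest diameter", "external")]
internal = [
    Rule("lesion size is the longest diameter in mm", "internal"),
    Rule("every nodule is malignant", "internal"),
]
kept = align(internal, external)  # drops the rule that external knowledge does not support
report = "solid nodule with lesion size 8 mm"
prompt = build_prompt(report, retrieve(report, external + kept))
```

In this sketch the unsupported internal rule is filtered out before prompting, so only rules consistent with the external knowledge base reach the in-context examples.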