Objectives: We aim to build AutoRD, an end-to-end system that automates the extraction of rare disease information from clinical text. We evaluate AutoRD in several tests and discuss its strengths and limitations. Materials and Methods: AutoRD is a software pipeline comprising data preprocessing, entity extraction, relation extraction, entity calibration, and knowledge graph construction. We implement it using large language models (LLMs) and medical knowledge graphs built from open-source medical ontologies. We quantitatively evaluate the system on entity extraction, relation extraction, and knowledge graph construction. Results: AutoRD achieves an overall F1 score of 47.3%, a 14.4% improvement over the base LLM. In detail, AutoRD achieves an overall entity extraction F1 score of 56.1% (rare_disease: 83.5%, disease: 35.8%, symptom_and_sign: 46.1%, anaphor: 67.5%) and an overall relation extraction F1 score of 38.6% (produces: 34.7%, increases_risk_of: 12.4%, is_a: 37.4%, is_acronym: 44.1%, is_synonym: 16.3%, anaphora: 57.5%). A qualitative experiment also indicates that AutoRD performs well at knowledge graph construction. Discussion: AutoRD demonstrates the potential of LLM applications in rare disease detection. The improvement over the base LLM is attributed to several design choices, including the integration of ontology-enhanced LLMs. Conclusion: AutoRD is an automated end-to-end system that extracts rare disease information from text to build knowledge graphs. It uses ontology-enhanced LLMs to provide a robust medical knowledge base. Experimental evaluation supports AutoRD's performance advantage over the base LLM, demonstrating the potential of LLMs in healthcare.
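The F1 scores reported above combine precision and recall as F1 = 2PR / (P + R). The following is a minimal sketch of how such a score could be computed for entity extraction, assuming exact matching on (mention, type) pairs. The `Entity` representation, the `f1_score` helper, and the sample mentions are hypothetical illustrations; the abstract does not specify AutoRD's actual matching criteria or evaluation code.

```python
from typing import Set, Tuple

# Hypothetical representation: (mention text, entity type).
Entity = Tuple[str, str]


def f1_score(predicted: Set[Entity], gold: Set[Entity]) -> float:
    """Exact-match F1 between a predicted and a gold set of entities."""
    if not predicted or not gold:
        return 0.0
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# Illustrative data only; not taken from the evaluation corpus.
gold = {
    ("Marfan syndrome", "rare_disease"),
    ("aortic dilation", "symptom_and_sign"),
    ("scoliosis", "symptom_and_sign"),
}
predicted = {
    ("Marfan syndrome", "rare_disease"),
    ("scoliosis", "symptom_and_sign"),
    ("myopia", "disease"),
}

score = f1_score(predicted, gold)
print(f"Entity F1: {score:.3f}")
```

In this toy case two of three predictions match two of three gold entities, so precision and recall are both 2/3. Relation extraction could be scored the same way over (head, relation, tail) triples, again under the exact-match assumption.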