Medical Large Language Models (LLMs) have demonstrated impressive performance on a wide variety of medical NLP tasks; however, there is still no LLM specifically designed for phenotype identification and diagnosis in the cancer domain. Moreover, existing LLMs typically have several billion parameters, making them computationally expensive for healthcare systems. Thus, in this study, we propose CancerLLM, a model with 7 billion parameters and a Mistral-style architecture, pre-trained on nearly 2.7M clinical notes and over 515K pathology reports covering 17 cancer types, followed by fine-tuning on two cancer-relevant tasks: cancer phenotype extraction and cancer diagnosis generation. Our evaluation demonstrated that CancerLLM achieves state-of-the-art results, with an F1 score of 91.78% on phenotype extraction and 86.81% on diagnosis generation. It outperformed existing LLMs, with an average F1 score improvement of 9.23%. Additionally, CancerLLM demonstrated greater efficiency in time and GPU usage, as well as greater robustness, compared with other LLMs. We demonstrated that CancerLLM can potentially provide an effective and robust solution to advance clinical research and practice in the cancer domain.