Automatic text summarization (ATS) is an emerging technology that can assist clinicians in providing continuous and coordinated care. This study presents an approach to summarizing doctor-patient dialogues using generative large language models (LLMs). We developed prompt-tuning algorithms to instruct generative LLMs to summarize clinical text. We examined the prompt-tuning strategies, the size of soft prompts, and the few-shot learning ability of GatorTronGPT, a generative clinical LLM with up to 20 billion parameters developed using 277 billion words of clinical and general English text. We compared GatorTronGPT with a previous solution based on fine-tuning a widely used T5 model, using the clinical benchmark dataset MTS-DIALOG. The experimental results show that the GatorTronGPT-20B model achieved the best performance on all evaluation metrics. The proposed solution has a low computing cost because the LLM parameters are not updated during prompt tuning. This study demonstrates the efficiency of generative clinical LLMs for clinical ATS through prompt tuning.