Automatic text summarization (ATS) is an emerging technology to assist clinicians in providing continuous and coordinated care. This study presents an approach to summarizing doctor-patient dialogues using generative large language models (LLMs). We developed prompt-tuning algorithms to instruct generative LLMs to summarize clinical text. We examined the prompt-tuning strategies, the size of soft prompts, and the few-shot learning ability of GatorTronGPT, a generative clinical LLM with up to 20 billion parameters developed using 277 billion words of clinical and general English text. We compared GatorTronGPT with a previous solution based on fine-tuning a widely used T5 model, using the clinical benchmark dataset MTS-DIALOG. The experimental results show that the GatorTronGPT-20B model achieved the best performance on all evaluation metrics. The proposed solution has a low computing cost because the LLM parameters are not updated during prompt tuning. This study demonstrates the efficiency of generative clinical LLMs for clinical ATS through prompt tuning.
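The claim that prompt tuning leaves the LLM parameters untouched can be illustrated with a toy sketch in plain Python. This is not GatorTronGPT or the study's implementation: the mean-pooling "model", the function names (`make_frozen_model`, `forward`, `tune_soft_prompt`), and every hyperparameter below are hypothetical choices made for illustration only. The sketch prepends trainable soft-prompt vectors to frozen token embeddings and applies gradient updates to the soft prompt alone.

```python
import random


def make_frozen_model(vocab_size, dim, seed=0):
    """Build a toy 'pretrained model': an embedding table and a readout vector.
    Both stand in for LLM weights and are never updated (assumption: toy model)."""
    rng = random.Random(seed)
    embed = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(vocab_size)]
    readout = [rng.uniform(-1, 1) for _ in range(dim)]
    return embed, readout


def forward(soft_prompt, token_ids, embed, readout):
    """Prepend the soft prompt to the token embeddings, mean-pool, then read out."""
    rows = soft_prompt + [embed[t] for t in token_ids]
    dim = len(readout)
    pooled = [sum(r[j] for r in rows) / len(rows) for j in range(dim)]
    return sum(p * w for p, w in zip(pooled, readout))


def tune_soft_prompt(token_ids, target, embed, readout,
                     prompt_len=4, lr=0.5, steps=200):
    """Minimize squared error by gradient descent on the soft prompt only."""
    dim = len(readout)
    soft_prompt = [[0.0] * dim for _ in range(prompt_len)]
    n = prompt_len + len(token_ids)
    losses = []
    for _ in range(steps):
        err = forward(soft_prompt, token_ids, embed, readout) - target
        losses.append(err ** 2)
        # d(loss)/d(soft_prompt[i][j]) = 2 * err * readout[j] / n
        for row in soft_prompt:
            for j in range(dim):
                row[j] -= lr * 2 * err * readout[j] / n
        # embed and readout are deliberately left untouched (frozen).
    return soft_prompt, losses


if __name__ == "__main__":
    embed, readout = make_frozen_model(vocab_size=10, dim=8)
    prompt, losses = tune_soft_prompt([1, 2, 3], 5.0, embed, readout)
    trainable = len(prompt) * len(prompt[0])
    frozen = len(embed) * len(embed[0]) + len(readout)
    print(f"trainable={trainable}, frozen={frozen}")
    print(f"first loss={losses[0]:.4f}, last loss={losses[-1]:.6f}")
```

In this toy setting the number of trainable values is `prompt_len * dim`, independent of the size of the frozen model, which is the property behind the low computing cost noted in the abstract.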