This paper presents ClinicSum, a novel framework designed to automatically generate clinical summaries from patient-doctor conversations. It utilizes a two-module architecture: a retrieval-based filtering module that extracts Subjective, Objective, Assessment, and Plan (SOAP) information from conversation transcripts, and an inference module powered by fine-tuned Pre-trained Language Models (PLMs), which leverages the extracted SOAP data to generate abstractive clinical summaries. To fine-tune the PLMs, we created a training dataset consisting of 1,473 conversation-summary pairs by consolidating two publicly available datasets, FigShare and MTS-Dialog, with ground-truth summaries validated by Subject Matter Experts (SMEs). ClinicSum's effectiveness is evaluated through both automatic metrics (e.g., ROUGE, BERTScore) and expert human assessments. Results show that ClinicSum outperforms state-of-the-art PLMs, demonstrating superior precision, recall, and F1 scores in automatic evaluations and receiving high preference from SMEs in human assessment, making it a robust solution for automated clinical summarization.