Medical progress notes play a crucial role in documenting a patient's hospital journey, including their condition, treatment plan, and any updates for healthcare providers. Automatic summarisation of a patient's problems in the form of a problem list can help stakeholders understand the patient's condition, reducing workload and cognitive bias. BioNLP 2023 Shared Task 1A focuses on generating a list of diagnoses and problems from providers' progress notes written during hospitalisation. In this paper, we introduce our approach to this task, which integrates two complementary components. One component employs large language models (LLMs) for data augmentation; the other is an abstractive summarisation LLM with a novel pre-training objective, which generates the patient's problems summarised as a list. Our approach was ranked second among all submissions to the shared task. The performance of our model on the development and test datasets shows that our approach is more robust on unseen data, with an improvement of up to 3.1 points over the same size of the larger model.