During a patient's hospitalization, the physician must record daily observations of the patient and, at discharge, summarize them into a brief document called a "discharge summary." Automated generation of discharge summaries can greatly relieve physicians' burden and has recently been addressed by the research community. Most previous studies of discharge summary generation with sequence-to-sequence architectures use only inpatient notes as input. However, electronic health records (EHRs) also contain rich structured metadata (e.g., hospital, physician, disease, and length of stay) that might be useful. This paper investigates the effectiveness of medical meta-information for summarization tasks. We obtain four types of meta-information from EHR systems and encode each type into a sequence-to-sequence model. On Japanese EHRs, the meta-information-encoded models increased ROUGE-1 by up to 4.45 points and BERTScore by 3.77 points over the vanilla Longformer. We also found that the encoded meta-information improves the precision of its related terms in the outputs. Our results show the benefit of using medical meta-information.