Electronic health records (EHRs), which contain patients' medical histories, are typically written as free-form (unstructured) text because the information they record is inherently complex. Quickly understanding a patient's history is critical, yet it is challenging because writing styles vary among doctors, and misreadings may even cause clinical incidents. This paper proposes the Health Record Timeliner system (HeaRT), which visualises patients' clinical histories directly from natural language text in EHRs. Unlike the few previous attempts, our system achieves feasible and practical performance for the first time by integrating a state-of-the-art language model that recognises clinical entities (e.g. diseases, medicines, and time expressions) and their temporal relations in the raw text of EHRs and radiology reports. By chronologically aligning the clinical entities with the clinical events extracted from a medical report, this web-based system visualises them in a Gantt chart-like format. Our novel evaluation method showed that the proposed system successfully generated coherent timelines from two sets of radiology reports describing the same CT scan but written by different radiologists. Real-world assessments are planned to address the remaining issues.
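The chronological alignment and Gantt chart-like display described in the abstract can be illustrated with a minimal sketch. This is not the HeaRT implementation: it assumes that entity recognition and temporal relation extraction have already resolved each clinical event to a start and end date, and every name in it (`ClinicalEvent`, `build_timeline`, `render_gantt`) is hypothetical, as are the sample events.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ClinicalEvent:
    """Hypothetical record for one extracted clinical event."""
    label: str   # surface text of the entity, e.g. "pneumonia"
    kind: str    # entity type, e.g. "disease", "medicine", "test"
    start: date  # resolved start date (assumed to be already normalised)
    end: date    # resolved end date (equal to start for point events)


def build_timeline(events):
    """Order events chronologically, breaking ties by end date and label."""
    return sorted(events, key=lambda e: (e.start, e.end, e.label))


def render_gantt(events, width=40):
    """Render a plain-text, Gantt chart-like view with one row per event."""
    if not events:
        return ""
    ordered = build_timeline(events)
    origin = min(e.start for e in ordered)
    horizon = max(e.end for e in ordered)
    span = max((horizon - origin).days, 1)  # avoid division by zero
    name_width = max(len(e.label) for e in ordered)
    rows = []
    for e in ordered:
        # Map dates to column offsets; every event gets at least one cell.
        left = min((e.start - origin).days * width // span, width - 1)
        right = max((e.end - origin).days * width // span, left + 1)
        bar = " " * left + "#" * (right - left)
        rows.append(f"{e.label:<{name_width}} |{bar.ljust(width)}|")
    return "\n".join(rows)


if __name__ == "__main__":
    # Invented sample events, for illustration only.
    sample = [
        ClinicalEvent("pneumonia", "disease", date(2020, 1, 10), date(2020, 1, 20)),
        ClinicalEvent("CT scan", "test", date(2020, 1, 1), date(2020, 1, 1)),
        ClinicalEvent("antibiotics", "medicine", date(2020, 1, 11), date(2020, 1, 18)),
    ]
    print(render_gantt(sample))
```

In a real system the hard part is the step this sketch skips: turning relative or vague time expressions in free text into comparable dates. Sorting and rendering become straightforward once that normalisation is available.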