Electronic Health Record (EHR) coding is the task of automatically assigning diagnostic codes to EHRs. Most previous research treats it as a multi-label classification task, generating a probability for each code and selecting the codes above a certain threshold as labels, but these approaches often overlook the challenge of identifying complex diseases. This study focuses on detecting complication diseases within EHRs. We propose a novel coarse-to-fine ICD path generation framework, the Copy Recurrent Neural Network Structure Network (CRNNet), which employs a Path Generator (PG) and a Path Discriminator (PD) for EHR coding. By using RNNs to generate sequential outputs and incorporating a copy module, the framework efficiently identifies complication diseases. Our method achieves a 57.30% ratio of complex diseases in its predictions, outperforming previous approaches, including the state of the art. Additionally, an ablation study demonstrates that the copy mechanism plays a crucial role in detecting complex diseases.