Deep-learning-based clinical decision support using structured electronic health records (EHR) has been an active research area for predicting the risk of mortality and disease. Meanwhile, large volumes of narrative clinical notes provide complementary information but are often not integrated into predictive models. In this paper, we propose a novel multimodal transformer that fuses clinical notes and structured EHR data to better predict in-hospital mortality. To improve interpretability, we propose a method based on integrated gradients (IG) to select important words in clinical notes, and we use Shapley values to identify the critical structured EHR features. These important words and clinical features are visualized to assist with interpreting the prediction outcomes. We also investigate the significance of domain-adaptive pretraining and task-adaptive fine-tuning for Clinical BERT, which is used to learn representations of the clinical notes. Experiments demonstrate that our model outperforms other methods (AUCPR: 0.538, AUCROC: 0.877, F1: 0.490).
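To make the integrated gradients idea concrete, the following is a minimal sketch under stated assumptions, not the model described above. It replaces the multimodal transformer with a toy logistic model over a small feature vector, and the names `model`, `model_grad`, and `integrated_gradients` are hypothetical. IG attributes a prediction to each input dimension by averaging gradients along a straight path from a baseline to the input and scaling by the input difference; the attributions should sum approximately to the change in model output (the completeness property).

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def model(x, w, b):
    """Toy stand-in model (assumption): logistic regression on a feature vector."""
    return sigmoid(x @ w + b)

def model_grad(x, w, b):
    """Analytic gradient of the toy model output with respect to the input x."""
    p = model(x, w, b)
    return p * (1.0 - p) * w

def integrated_gradients(x, baseline, w, b, steps=300):
    """Approximate IG with a midpoint Riemann sum along the straight path
    from `baseline` to `x`."""
    alphas = (np.arange(steps) + 0.5) / steps             # midpoints in (0, 1)
    path = baseline + alphas[:, None] * (x - baseline)    # shape (steps, d)
    grads = np.stack([model_grad(p, w, b) for p in path])  # shape (steps, d)
    return (x - baseline) * grads.mean(axis=0)

# Hypothetical toy data: five features, all-zero baseline.
rng = np.random.default_rng(0)
w = rng.normal(size=5)
b = 0.1
x = rng.normal(size=5)
baseline = np.zeros_like(x)

attributions = integrated_gradients(x, baseline, w, b)

# Completeness check: attributions sum to F(x) - F(baseline), up to
# the numerical error of the Riemann approximation.
completeness_gap = abs(
    attributions.sum() - (model(x, w, b) - model(baseline, w, b))
)
```

For text inputs, the same computation would typically be applied to token embeddings, with per-token scores obtained by summing attributions over the embedding dimensions; the baseline choice (here all zeros) is an assumption that affects the resulting attributions.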