Fast and Interpretable Mortality Risk Scores for Critical Care Patients

Prediction of mortality in intensive care unit (ICU) patients is an important task in critical care medicine. Prior work on mortality risk models falls into two major categories: scoring systems created by domain experts, and black box machine learning (ML) models. Both have disadvantages: black box models are unacceptable for use in hospitals, whereas manual creation of models (including hand-tuning of logistic regression parameters) relies on humans to perform high-dimensional constrained optimization, which leads to a loss in performance. In this work, we bridge the gap between accurate black box models and hand-tuned interpretable models. We build on modern interpretable ML techniques to design accurate and interpretable mortality risk scores. We leverage the largest existing public ICU monitoring datasets, namely the MIMIC III and eICU datasets. By evaluating risk across medical centers, we are able to study generalization across domains. In order to customize our risk score models, we develop a new algorithm, GroupFasterRisk, which has several important benefits: (1) it uses a hard sparsity constraint, allowing users to directly control the number of features; (2) it incorporates group sparsity to allow more cohesive models; (3) it supports monotonicity correction on models, so that domain knowledge can be incorporated; (4) it produces many equally good models at once, which allows domain experts to choose among them. GroupFasterRisk creates its risk scores within hours, even on the large datasets we study here. GroupFasterRisk's risk scores perform better than risk scores currently used in hospitals, and have similar prediction performance to black box ML models (despite being much sparser). Because GroupFasterRisk produces a variety of risk scores and handles constraints, it allows design flexibility, which is the key enabler of practical and trustworthy model creation.
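
To make the vocabulary of the abstract concrete, the following is a minimal sketch of what a sparse, grouped, monotone risk score can look like once it has been learned. It is not GroupFasterRisk itself and does not reproduce any model from the paper: the feature names, thresholds, point values, `INTERCEPT`, and `MULTIPLIER` are all hypothetical, and the logistic link from total points to risk is an assumption made for illustration.

```python
import math

# Hypothetical score card (illustrative values only, not from the paper).
# Each row: (feature group, human-readable condition, test function, integer points).
SCORE_CARD = [
    ("age",     "age >= 75",      lambda p: p["age"] >= 75,       2),
    ("gcs",     "gcs <= 12",      lambda p: p["gcs"] <= 12,       1),
    ("gcs",     "gcs <= 8",       lambda p: p["gcs"] <= 8,        3),
    ("lactate", "lactate >= 4.0", lambda p: p["lactate"] >= 4.0,  2),
]
INTERCEPT = -6     # hypothetical integer offset
MULTIPLIER = 1.5   # hypothetical scaling between points and log-odds


def total_points(patient):
    """Sum the integer points of every condition the patient satisfies."""
    return sum(points for _, _, test, points in SCORE_CARD if test(patient))


def predicted_risk(patient):
    """Map total points to a probability with a logistic link (an assumption)."""
    z = (total_points(patient) + INTERCEPT) / MULTIPLIER
    return 1.0 / (1.0 + math.exp(-z))


def groups_used():
    """Number of distinct feature groups: the quantity a group-sparsity budget limits."""
    return len({group for group, _, _, _ in SCORE_CARD})


patient = {"age": 80, "gcs": 7, "lactate": 1.0}
print(total_points(patient), predicted_risk(patient), groups_used())
```

In this sketch, the number of rows plays the role of the hard sparsity budget, the number of distinct groups plays the role of the group sparsity budget, and the nested GCS thresholds with non-negative points make the score monotone: a lower GCS can never yield fewer points.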
