Mixed-Integer Projections for Automated Data Correction of EMRs Improve Predictions of Sepsis among Hospitalized Patients

Machine learning (ML) models are increasingly pivotal in automating clinical decisions. Yet prior research has largely overlooked the need to process Electronic Medical Record (EMR) data for errors and outliers in a clinically informed way. To address this gap, we introduce a projections-based method that integrates clinical expertise as domain constraints and generates meta-data that can be used in ML workflows. In particular, we use high-dimensional mixed-integer programs that capture physiological and biological constraints on patient vitals and lab values, and we correct patient data by mathematically "projecting" the EMR data onto these constraints. We then measure the distance of the corrected data from the constraints defining a healthy range of patient data, which yields a predictive metric we term "trust-scores". These scores provide insight into the patient's health status and significantly boost the performance of ML classifiers in real-life clinical settings. We validate the impact of our framework in the context of early detection of sepsis using ML. We show an AUROC of 0.865 and a precision of 0.922, both of which surpass conventional ML models without such projections.
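
The correct-then-score idea in the abstract can be illustrated with a deliberately simplified sketch. It assumes the constraints are independent per-feature intervals (a box), for which the Euclidean projection reduces to clipping. The abstract describes mixed-integer programs, which this sketch does not implement. The feature names, the bounds, and the helper names `project_onto_box`, `distance_to_box`, and `correct_and_score` are hypothetical and chosen only for illustration.

```python
import numpy as np

# Hypothetical bounds, for illustration only (not taken from the paper).
# FEASIBLE: values outside this range are treated as recording errors.
# HEALTHY: the range used to compute the distance-based score.
FEASIBLE = {"heart_rate": (20.0, 300.0), "temp_c": (25.0, 45.0)}
HEALTHY = {"heart_rate": (60.0, 100.0), "temp_c": (36.0, 37.5)}


def project_onto_box(x, lower, upper):
    """Euclidean projection onto a box, which is element-wise clipping."""
    return np.clip(x, lower, upper)


def distance_to_box(x, lower, upper):
    """Euclidean distance from x to the nearest point of the box."""
    return float(np.linalg.norm(x - project_onto_box(x, lower, upper)))


def correct_and_score(record):
    """Correct a record against FEASIBLE, then score it against HEALTHY.

    Returns the corrected record and a trust-score-like distance, where
    0.0 means every corrected value lies inside the healthy range.
    """
    names = list(FEASIBLE)
    x = np.array([record[n] for n in names], dtype=float)

    f_lo = np.array([FEASIBLE[n][0] for n in names])
    f_hi = np.array([FEASIBLE[n][1] for n in names])
    h_lo = np.array([HEALTHY[n][0] for n in names])
    h_hi = np.array([HEALTHY[n][1] for n in names])

    corrected = project_onto_box(x, f_lo, f_hi)
    score = distance_to_box(corrected, h_lo, h_hi)
    return dict(zip(names, corrected.tolist())), score


if __name__ == "__main__":
    # A heart rate of 350 is outside the assumed feasible range, so it is
    # pulled back to the nearest feasible value before scoring.
    print(correct_and_score({"heart_rate": 350.0, "temp_c": 37.0}))
```

In this toy setting the score could be appended to a feature matrix as one extra column before training a classifier. Constraints that couple several features, or that involve discrete choices, would require an optimization solver rather than clipping.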

