People Talking and AI Listening: How Stigmatizing Language in EHR Notes Affect AI Performance

Electronic health records (EHRs) serve as an essential data source for the envisioned artificial intelligence (AI)-driven transformation in healthcare. However, clinician biases reflected in EHR notes can lead AI models to inherit and amplify these biases, perpetuating health disparities. This study investigates the impact of stigmatizing language (SL) in EHR notes on mortality prediction using a Transformer-based deep learning model and explainable AI (XAI) techniques. Our findings demonstrate that SL written by clinicians adversely affects AI performance, particularly for Black patients, highlighting SL as a source of racial disparity in AI model development. To explore an operationally efficient way to mitigate SL's impact, we investigate patterns in the generation of SL through a clinicians' collaborative network, and identify central clinicians as having a stronger impact on racial disparity in the AI model. We find that removing SL written by central clinicians is a more efficient bias reduction strategy than eliminating all SL in the entire corpus of data. This study provides actionable insights for responsible AI development and contributes to understanding clinician behavior and EHR note writing in healthcare.

