The Misuse of AUC: What High Impact Risk Assessment Gets Wrong

AUC · MoDELS · Performer · Model validation · Statistics

May 29, 2023

Kweku Kwegyir-Aggrey, Marissa Gerchick, Malika Mohan, Aaron Horowitz, Suresh Venkatasubramanian

When determining which machine learning model best performs some high impact risk assessment task, practitioners commonly use the Area under the Curve (AUC) to defend and validate their model choices. In this paper, we argue that the current use and understanding of AUC as a model performance metric misunderstands the way the metric was intended to be used. To this end, we characterize the misuse of AUC and illustrate how this misuse negatively manifests in the real world across several risk assessment domains. We locate this disconnect in the way the original interpretation of AUC has shifted over time to the point where issues pertaining to decision thresholds, class balance, statistical uncertainty, and protected groups remain unaddressed by AUC-based model comparisons, and where model choices that should be the purview of policymakers are hidden behind the veil of mathematical rigor. We conclude that current model validation practices involving AUC are not robust, and often invalid.
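The point about decision thresholds can be made concrete with a small sketch. It is not taken from the paper: the labels, the scores, the 0.5 cut-off, and the helper names `auc` and `rates_at` are all hypothetical and chosen only for illustration. The sketch computes AUC as a rank statistic (the fraction of positive/negative pairs in which the positive case receives the higher score, ties counting half) and then compares two score sets that rank the cases identically.

```python
# Toy illustration (hypothetical data): AUC depends only on how cases are
# ranked, so two models with identical AUC can behave very differently once
# a decision threshold is fixed.

def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    # A correctly ordered pair counts 1, a tie counts 0.5, otherwise 0.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def rates_at(labels, scores, threshold):
    """Return (true positive rate, false positive rate) at a threshold."""
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    return tp / n_pos, fp / n_neg

# Hypothetical outcomes (1 = event occurred) and risk scores from "model A".
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores_a = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]

# "Model B" is a monotone transform of model A: same ranking, new scale.
scores_b = [s ** 2 for s in scores_a]

print("AUC A:", auc(labels, scores_a))
print("AUC B:", auc(labels, scores_b))
print("A at 0.5 (TPR, FPR):", rates_at(labels, scores_a, 0.5))
print("B at 0.5 (TPR, FPR):", rates_at(labels, scores_b, 0.5))
```

With these toy numbers both score sets have an AUC of 0.875, yet at a cut-off of 0.5 the first flags three of the four positives and one of the four negatives, while the second flags two positives and no negatives. An AUC-based comparison cannot distinguish the two, even though the decisions they produce at that threshold differ.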
