Does it pay to optimize AUC? - 专知论文


June 2, 2023

Does it pay to optimize AUC?


Baojian Zhou,Steven Skiena

From arXiv, 16 pages, AAAI

The Area Under the ROC Curve (AUC) is an important metric for evaluating binary classifiers, and many algorithms have been proposed to optimize AUC approximately. This raises the question of whether the generally insignificant gains observed in previous studies are due to inherent limitations of the metric or to the inadequate quality of the optimization. To better understand the value of optimizing for AUC, we present an efficient algorithm, namely AUC-opt, to find the provably optimal AUC linear classifier in $\mathbb{R}^2$, which runs in $\mathcal{O}(n_+ n_- \log (n_+ n_-))$ time, where $n_+$ and $n_-$ are the numbers of positive and negative samples, respectively. Furthermore, it can be naturally extended to $\mathbb{R}^d$ in $\mathcal{O}((n_+n_-)^{d-1}\log (n_+n_-))$ time by calling AUC-opt in lower-dimensional spaces recursively. We prove the problem is NP-complete when $d$ is not fixed, reducing from the \textit{open hemisphere problem}. Experiments show that, compared with other methods, AUC-opt achieves statistically significant improvements on between 17 and 40 of 50 t-SNE training datasets in $\mathbb{R}^2$ and on between 4 and 42 of them in $\mathbb{R}^3$. However, the gain generally proves insignificant on most testing datasets compared to the best standard classifiers. Similar observations are found for nonlinear AUC methods on real-world datasets.
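To make the objective concrete: for a linear scorer $x \mapsto w^\top x$, the empirical AUC is the fraction of positive-negative pairs ranked correctly, $\mathrm{AUC}(w) = \frac{1}{n_+ n_-} \sum_{i=1}^{n_+} \sum_{j=1}^{n_-} \mathbb{1}[w^\top x_i^+ > w^\top x_j^-]$, with ties usually counted as $1/2$. The sketch below is not the authors' AUC-opt, whose internals the abstract does not describe. It is a minimal brute-force illustration, under our own assumptions, of why an exact search is possible in $\mathbb{R}^2$: each pair's ranking only flips when $w$ crosses a direction perpendicular to $x_i^+ - x_j^-$, so it suffices to test one direction per angular cell between consecutive critical angles. The function names `auc` and `best_direction_2d` are hypothetical, and this version costs time quadratic in the number of pairs, not the log-linear bound stated above.

```python
import math
from itertools import product


def auc(w, pos, neg):
    """Empirical AUC of the linear scorer x -> w . x; a tied pair counts 1/2."""
    total = 0.0
    for p, q in product(pos, neg):
        margin = w[0] * (p[0] - q[0]) + w[1] * (p[1] - q[1])
        if margin > 0:
            total += 1.0
        elif margin == 0:
            total += 0.5
    return total / (len(pos) * len(neg))


def best_direction_2d(pos, neg):
    """Try one direction per angular cell and return (w, auc) for the best."""
    two_pi = 2.0 * math.pi
    critical = set()
    for p, q in product(pos, neg):
        dx, dy = p[0] - q[0], p[1] - q[1]
        if dx == 0 and dy == 0:
            continue  # coincident points are tied for every w
        phi = math.atan2(dy, dx)
        # The pair is ranked correctly iff the angle of w lies strictly
        # within 90 degrees of phi, so its ranking flips at phi +/- pi/2.
        critical.add((phi + math.pi / 2) % two_pi)
        critical.add((phi - math.pi / 2) % two_pi)
    if not critical:
        return (1.0, 0.0), 0.5  # every pair is tied regardless of w
    angles = sorted(critical)
    best_w, best_auc = None, -1.0
    for i, start in enumerate(angles):
        # The last cell wraps around the circle back to the first angle.
        end = angles[i + 1] if i + 1 < len(angles) else angles[0] + two_pi
        theta = (start + end) / 2.0  # a direction inside this cell
        w = (math.cos(theta), math.sin(theta))
        score = auc(w, pos, neg)  # re-score all pairs for this direction
        if score > best_auc:
            best_w, best_auc = w, score
    return best_w, best_auc
```

Testing only cell interiors loses nothing: at a critical angle the pairs on the boundary each count $1/2$, and one of the two adjacent cells ranks at least half of them correctly. Re-scoring every pair in every cell is what makes this sketch quadratic; updating the count incrementally while sweeping the sorted angles would be one plausible way to approach a log-linear running time.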

