Scoring functions · Error estimation · Generalization · Agreement · Effectiveness

March 23, 2023

A Closer Look at Scoring Functions and Generalization Prediction


Puja Trivedi, Danai Koutra, Jayaraman J. Thiagarajan

From arXiv. Accepted to ICASSP 2023.

Generalization error predictors (GEPs) aim to predict model performance on unseen distributions by deriving dataset-level error estimates from sample-level scores. However, GEPs often use disparate mechanisms (e.g., regressors, thresholding functions, calibration datasets) to derive such error estimates, which can obscure the benefits of a particular scoring function. Therefore, in this work, we rigorously study the effectiveness of popular scoring functions (confidence, local manifold smoothness, model agreement), independent of mechanism choice. We find that, absent complex mechanisms, state-of-the-art confidence- and smoothness-based scores fail to outperform simple model-agreement scores when estimating error under distribution shifts and corruptions. Furthermore, in realistic settings where the training data has been compromised (e.g., label noise, measurement noise, undersampling), we find that model-agreement scores continue to perform well and that ensemble diversity is important for improving their performance. Finally, to better understand the limitations of scoring functions, we demonstrate that simplicity bias, the propensity of deep neural networks to rely on simple but brittle features, can adversely affect GEP performance. Overall, our work carefully studies the effectiveness of popular scoring functions in realistic settings and helps to better understand their limitations.
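To make the idea of turning sample-level scores into a dataset-level error estimate concrete, here is a minimal Python sketch of two of the scoring functions named in the abstract. It is an illustration under stated assumptions, not the authors' implementation: the abstract gives no formulas, so the aggregation rules (one minus the mean maximum class probability, and the mean disagreement rate between a base model and the other ensemble members) and the function names `confidence_error_estimate` and `agreement_error_estimate` are hypothetical choices made here.

```python
from typing import List, Sequence


def confidence_error_estimate(probs: List[Sequence[float]]) -> float:
    """Estimate dataset-level error from per-sample confidence.

    Assumption: the sample-level score is the maximum class probability,
    and the error estimate is one minus the mean score.
    """
    confidences = [max(p) for p in probs]
    return 1.0 - sum(confidences) / len(confidences)


def agreement_error_estimate(
    base_preds: List[int], ensemble_preds: List[List[int]]
) -> float:
    """Estimate dataset-level error from model agreement.

    Assumption: the error estimate is the disagreement rate between the
    base model and each other ensemble member, averaged over members.
    """
    n = len(base_preds)
    rates = []
    for member in ensemble_preds:
        # Count unlabeled target samples where this member disagrees
        # with the base model's predicted label.
        disagreements = sum(1 for a, b in zip(base_preds, member) if a != b)
        rates.append(disagreements / n)
    return sum(rates) / len(rates)


if __name__ == "__main__":
    # Toy, made-up predictions on four unlabeled target samples.
    probs = [[0.9, 0.1], [0.6, 0.4], [0.2, 0.8], [0.5, 0.5]]
    base = [0, 0, 1, 0]
    others = [[0, 1, 1, 0], [0, 0, 1, 1]]
    print(confidence_error_estimate(probs))
    print(agreement_error_estimate(base, others))
```

Neither function needs target labels, a regressor, or a calibration set, which is the sense in which such estimates are independent of mechanism choice. The agreement estimate does depend on how the other ensemble members were trained, which is where ensemble diversity enters.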

