Diagnosing Model Performance Under Distribution Shift - 专知论文

March 3, 2023

Diagnosing Model Performance Under Distribution Shift

Tiffany Cai, Hongseok Namkoong, Steve Yadlowsky

Prediction models can perform poorly when deployed to target distributions different from the training distribution. To understand these operational failure modes, we develop a method, called DIstribution Shift DEcomposition (DISDE), to attribute a drop in performance to different types of distribution shifts. Our approach decomposes the performance drop into terms for 1) an increase in harder but frequently seen examples from training, 2) changes in the relationship between features and outcomes, and 3) poor performance on examples infrequent or unseen during training. These terms are defined by fixing a distribution on $X$ while varying the conditional distribution of $Y \mid X$ between training and target, or by fixing the conditional distribution of $Y \mid X$ while varying the distribution on $X$. In order to do this, we define a hypothetical distribution on $X$ consisting of values common in both training and target, over which it is easy to compare $Y \mid X$ and thus predictive performance. We estimate performance on this hypothetical distribution via reweighting methods. Empirically, we show how our method can 1) inform potential modeling improvements across distribution shifts for employment prediction on tabular census data, and 2) help to explain why certain domain adaptation methods fail to improve model performance for satellite image classification.
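
The three-term decomposition described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation. The abstract says only that a shared distribution on $X$ is used and that performance on it is estimated by reweighting. The sketch assumes a shared density proportional to $p(x)q(x)/(p(x)+q(x))$, where $p$ and $q$ are the training and target densities of $X$, and it assumes the weights come from a domain classifier fit on a balanced pool of training and target inputs. The function name `decompose_performance_drop`, its arguments, and the synthetic Gaussian data are hypothetical.

```python
import numpy as np


def decompose_performance_drop(loss_train, loss_target, pi_train, pi_target):
    """Split mean(loss_target) - mean(loss_train) into three terms.

    pi_train and pi_target are probabilities P(target | x) evaluated on the
    training and target samples. They are assumed to come from a domain
    classifier fit on a balanced pool of training and target inputs.
    """
    # Self-normalized importance weights toward the assumed shared
    # distribution s(x), proportional to p(x) q(x) / (p(x) + q(x)).
    w_train = pi_train / pi_train.mean()                      # proportional to s/p
    w_target = (1.0 - pi_target) / (1.0 - pi_target).mean()   # proportional to s/q

    perf_train = loss_train.mean()
    perf_target = loss_target.mean()
    shared_train = np.mean(w_train * loss_train)      # training Y | X, shared X
    shared_target = np.mean(w_target * loss_target)   # target Y | X, shared X

    return {
        # 1) more weight on harder examples that are common in training
        "x_shift_train_to_shared": shared_train - perf_train,
        # 2) change in the relationship between features and outcomes
        "y_given_x_shift": shared_target - shared_train,
        # 3) examples infrequent or unseen during training
        "x_shift_shared_to_target": perf_target - shared_target,
        "total_drop": perf_target - perf_train,
    }


# Synthetic check: X ~ N(0, 1) in training and X ~ N(1, 1) in target, with
# the same loss as a function of x in both domains (so no Y | X shift).
rng = np.random.default_rng(0)
x_train = rng.normal(0.0, 1.0, size=50_000)
x_target = rng.normal(1.0, 1.0, size=50_000)


def prob_target(x):
    # Exact q(x) / (p(x) + q(x)) for these two Gaussians. In practice this
    # quantity would be estimated with a domain classifier.
    return 1.0 / (1.0 + np.exp(-(x - 0.5)))


terms = decompose_performance_drop(
    x_train**2, x_target**2, prob_target(x_train), prob_target(x_target)
)
for name, value in terms.items():
    print(f"{name}: {value:.3f}")
```

The three terms telescope, so they sum to the total drop by construction. In this synthetic setup the loss depends on $x$ in the same way in both domains, so the $Y \mid X$ term should be close to zero and the drop should be attributed to the two $X$-shift terms.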
