Ensemble learning is a popular technique for improving the accuracy of machine learning models. It traditionally hinges on the rationale that aggregating multiple weak models yields better models with lower variance and hence higher stability, especially for discontinuous base learners. In this paper, we provide a new perspective on ensembling. By selecting the best model trained on subsamples via majority voting, we can attain exponentially decaying tails for the excess risk, even if the base learner suffers from slow (i.e., polynomial) decay rates. This tail enhancement power of ensembling is agnostic to the underlying base learner and is stronger than variance reduction in the sense of exhibiting rate improvement. We demonstrate how our ensemble methods can substantially improve out-of-sample performance in a range of numerical examples involving heavy-tailed data or intrinsically slow rates. Code for the proposed methods is available at https://github.com/mickeyhqian/VoteEnsemble.
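The voting idea described above can be sketched as follows. This is a minimal illustrative sketch only, not the authors' actual algorithm: it assumes a generic base learner `fit` and loss function `loss` (both hypothetical names), trains candidate models on random subsamples, and lets fresh subsamples vote for the lowest-loss candidate; the majority winner is returned.

```python
import numpy as np

def vote_select(fit, loss, X, y, n_models=10, n_votes=50, frac=0.5, seed=None):
    """Illustrative majority-vote model selection (a sketch, not the paper's
    exact procedure). Trains `n_models` base models on random subsamples,
    then lets `n_votes` fresh subsamples each vote for the model with the
    lowest empirical loss; the model collecting the most votes is returned."""
    rng = np.random.default_rng(seed)
    n = len(y)
    m = max(1, int(frac * n))  # subsample size
    # Train one candidate model per random subsample.
    models = []
    for _ in range(n_models):
        idx = rng.choice(n, size=m, replace=False)
        models.append(fit(X[idx], y[idx]))
    # Each evaluation subsample votes for its lowest-loss candidate.
    votes = np.zeros(n_models, dtype=int)
    for _ in range(n_votes):
        idx = rng.choice(n, size=m, replace=False)
        losses = [loss(model, X[idx], y[idx]) for model in models]
        votes[int(np.argmin(losses))] += 1
    return models[int(np.argmax(votes))]

# Toy usage: the "base learner" is a sample-mean estimator under
# heavy-tailed t-distributed noise (a setting where slow rates arise).
rng = np.random.default_rng(0)
X = np.zeros((500, 1))                      # unused covariates in this toy
y = 1.0 + rng.standard_t(df=2.1, size=500)  # heavy-tailed data, mean 1
fit = lambda X, y: float(np.mean(y))        # "model" is a scalar estimate
loss = lambda m, X, y: float(np.mean((y - m) ** 2))
best = vote_select(fit, loss, X, y, seed=1)
```

The majority vote discards unlucky candidates trained on outlier-heavy subsamples, which is the intuition behind the exponential tail improvement the abstract claims.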