Bayesian Free Energy of Deep ReLU Neural Network in Overparametrized Cases

Bayesian · Overparametrization · ReLU · Parametrization · Free energy

March 28, 2023

Shuya Nagayasu, Sumio Watanabe

From arXiv, 20 pages, 2 figures

In many research fields in artificial intelligence, deep neural networks have been shown to be useful for estimating unknown functions on high-dimensional input spaces. However, their generalization performance is not yet fully understood from a theoretical point of view, because they are nonidentifiable and singular learning machines. Moreover, the ReLU function is not differentiable, so the algebraic and analytic methods of singular learning theory cannot be applied to it. In this paper, we study a deep ReLU neural network in overparametrized cases and prove that the Bayesian free energy, which is equal to the minus log marginal likelihood or the Bayesian stochastic complexity, is bounded even if the number of layers is larger than necessary to estimate an unknown data-generating function. Since the Bayesian generalization error is equal to the increase of the free energy as a function of the sample size, our result also shows that the Bayesian generalization error does not increase even if a deep ReLU neural network is designed to be sufficiently large or is in an overparametrized state.
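The link between free energy and generalization stated in the abstract can be written out as a short derivation. The block below is a sketch using standard Bayesian notation; the symbols (D_n, p, \varphi, Z_n, F_n, G_n) are assumptions introduced here for illustration and are not taken from the abstract, and the paper's own definitions and normalization may differ.

```latex
% Notation (assumed here, not taken from the abstract):
%   D_n = \{(X_i, Y_i)\}_{i=1}^{n}  i.i.d. sample from the data-generating distribution
%   p(y \mid x, w)                  model, e.g. a deep ReLU network with parameter w
%   \varphi(w)                      prior on w
\begin{align}
Z_n &= \int \varphi(w) \prod_{i=1}^{n} p(Y_i \mid X_i, w)\, dw
  && \text{(marginal likelihood)} \\
F_n &= -\log Z_n
  && \text{(Bayesian free energy)} \\
p(y \mid x, D_n) &= \frac{1}{Z_n} \int \varphi(w)\, p(y \mid x, w)
  \prod_{i=1}^{n} p(Y_i \mid X_i, w)\, dw
  && \text{(Bayesian predictive distribution)} \\
-\log p(Y_{n+1} \mid X_{n+1}, D_n) &= -\log \frac{Z_{n+1}}{Z_n} = F_{n+1} - F_n \\
G_n &= -\mathbb{E}_{(X,Y)}\bigl[\log p(Y \mid X, D_n)\bigr]
  && \text{(generalization loss)} \\
\mathbb{E}[G_n] &= \mathbb{E}[F_{n+1}] - \mathbb{E}[F_n]
\end{align}
```

The fourth line holds because evaluating the predictive distribution at a new pair (X_{n+1}, Y_{n+1}) gives exactly the ratio Z_{n+1}/Z_n; taking expectations over all n+1 samples yields the last line. The generalization error measured as a Kullback-Leibler divergence differs from G_n only by the entropy of the data-generating distribution, which does not depend on the model. Under this reading, an upper bound on the free energy that does not grow with the number of surplus layers carries over to the expected generalization loss.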
