Deep Equals Shallow for ReLU Networks in Kernel Regimes - 专知论文

会员服务 ·

0

核化 · Networking · 易处理的 · 近似 · ReLU ·

2021 年 3 月 17 日

Deep Equals Shallow for ReLU Networks in Kernel Regimes

翻译：内核制度 ReLU 网络的深等浅色

Alberto Bietti,Francis Bach

Deep networks are often considered to be more expressive than shallow ones in terms of approximation. Indeed, certain functions can be approximated by deep networks provably more efficiently than by shallow ones, however, no tractable algorithms are known for learning such deep models. Separately, a recent line of work has shown that deep networks trained with gradient descent may behave like (tractable) kernel methods in a certain over-parameterized regime, where the kernel is determined by the architecture and initialization, and this paper focuses on approximation for such kernels. We show that for ReLU activations, the kernels derived from deep fully-connected networks have essentially the same approximation properties as their shallow two-layer counterpart, namely the same eigenvalue decay for the corresponding integral operator. This highlights the limitations of the kernel framework for understanding the benefits of such deep architectures. Our main theoretical result relies on characterizing such eigenvalue decays through differentiability properties of the kernel function, which also easily applies to the study of other kernels defined on the sphere.

翻译：深层网络在近似方面往往被认为比浅层网络更清晰。事实上,某些功能可以比浅层网络更高效地被深层网络所近似,然而,在学习这种深层模型方面,没有已知的可移植算法。另外,最近的一项工作表明,受过梯度下降训练的深层网络在某种超分化制度中可能表现得像(可吸引的)内核方法,因为内核是由构造和初始化决定的,本文侧重于这类内核的近似。我们表明,对于RELU的激活,由深层完全连接网络产生的内核基本上具有与浅层对等的近似特性,即对相应整体操作者而言,其类值同样衰减。这凸显了内核框架在了解这种深层结构的惠益方面的局限性。我们的主要理论结果依赖于通过内核功能的可变性特性来定性这种脑值衰变。对于RELU的激活,从深层完全连接网络中产生的内核也很容易适用于对球体上定义的其他内核的研究。

0

相关内容

INRIA 最新《机器学习理论》课程笔记，176页pdf

专知会员服务

52+阅读 · 2020年12月14日

【干货书】机器学习速查手册，135页pdf

【干货书】机器学习速查手册，135页pdf

专知会员服务

127+阅读 · 2020年11月20日

【清华大学】图随机神经网络，Graph Random Neural Networks

【清华大学】图随机神经网络，Graph Random Neural Networks

专知会员服务

156+阅读 · 2020年5月26日

具有组合核的图神经网络，Graph Neural Networks with Composite Kernels

具有组合核的图神经网络，Graph Neural Networks with Composite Kernels

专知会员服务

59+阅读 · 2020年5月20日

神经网络的拓扑结构，TOPOLOGY OF DEEP NEURAL NETWORKS

神经网络的拓扑结构，TOPOLOGY OF DEEP NEURAL NETWORKS

专知会员服务

35+阅读 · 2020年4月15日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

96+阅读 · 2020年3月12日

【2020新书】算法与数据结构实战，286页pdf，Algorithms Data Structures in Action

【2020新书】算法与数据结构实战，286页pdf，Algorithms Data Structures in Action

专知会员服务

107+阅读 · 2020年2月22日

深度卷积神经网络的最新架构综述，A Survey of the Recent Architectures of Deep Convolutional Neural Networks

深度卷积神经网络的最新架构综述，A Survey of the Recent Architectures of Deep Convolutional Neural Networks

专知会员服务

49+阅读 · 2020年2月15日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

《DeepGCNs: Making GCNs Go as Deep as CNNs》

《DeepGCNs: Making GCNs Go as Deep as CNNs》

专知会员服务

32+阅读 · 2019年10月17日

图机器学习 2.2-2.4 Properties of Networks, Random Graph

图机器学习 2.2-2.4 Properties of Networks, Random Graph

图与推荐

10+阅读 · 2020年3月28日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

无监督元学习表示学习

无监督元学习表示学习

CreateAMind

27+阅读 · 2019年1月4日

meta learning 17年：MAML SNAIL

meta learning 17年：MAML SNAIL

CreateAMind

11+阅读 · 2019年1月2日

Hierarchical Disentangled Representations

Hierarchical Disentangled Representations

CreateAMind

4+阅读 · 2018年4月15日

【论文】变分推断（Variational inference)的总结

【论文】变分推断（Variational inference)的总结

机器学习研究会

39+阅读 · 2017年11月16日

Capsule Networks解析

Capsule Networks解析

机器学习研究会

11+阅读 · 2017年11月12日

【推荐】全卷积语义分割综述

【推荐】全卷积语义分割综述

机器学习研究会

19+阅读 · 2017年8月31日

【学习】Hierarchical Softmax

【学习】Hierarchical Softmax

机器学习研究会

4+阅读 · 2017年8月6日

强化学习族谱

强化学习族谱

CreateAMind

26+阅读 · 2017年8月2日

Kernel Thinning

Arxiv

0+阅读 · 2021年5月12日

Unbiased Monte Carlo Cluster Updates with Autoregressive Neural Networks

Arxiv

0+阅读 · 2021年5月12日

An efficient projection neural network for $\ell_1$-regularized logistic regression

Arxiv

0+阅读 · 2021年5月12日

A Bregman Learning Framework for Sparse Neural Networks

Arxiv

0+阅读 · 2021年5月10日

Generalization Guarantees for Neural Architecture Search with Train-Validation Split

Arxiv

0+阅读 · 2021年5月10日

Examining and Mitigating Kernel Saturation in Convolutional Neural Networks using Negative Images

Arxiv

0+阅读 · 2021年5月10日

Two-layer neural networks with values in a Banach space

Arxiv

0+阅读 · 2021年5月5日

Graph Neural Tangent Kernel: Fusing Graph Neural Networks with Graph Kernels

Arxiv

8+阅读 · 2019年11月4日

Adaptive Neural Trees

Adaptive Neural Trees

Arxiv

4+阅读 · 2018年12月10日

Deep Convolutional Networks as shallow Gaussian Processes

Arxiv

4+阅读 · 2018年8月16日

VIP会员

文章信息

相关主题

最新内容

ICML 2026 | 边界嵌入塑形：用自适应对比学习破解图结构纠缠

ICML 2026 | 边界嵌入塑形：用自适应对比学习破解图结构纠缠

专知会员服务

4+阅读 · 6月22日

综述 | 3D场景图：开放挑战与未来方向

综述 | 3D场景图：开放挑战与未来方向

专知会员服务

6+阅读 · 6月22日

《国防工业6.0：全自主作战系统、量子-人工智能融合与新一代战略威慑》

《国防工业6.0：全自主作战系统、量子-人工智能融合与新一代战略威慑》

专知会员服务

6+阅读 · 6月22日

21世纪的无人机战争

21世纪的无人机战争

专知会员服务

4+阅读 · 6月22日

《伊朗与以色列-美国热战及其对数字技术的影响》

《伊朗与以色列-美国热战及其对数字技术的影响》

专知会员服务

5+阅读 · 6月22日

《量子技术的军事任务技术适配与利用》

《量子技术的军事任务技术适配与利用》

专知会员服务

5+阅读 · 6月22日

《美国陆军军官学校（西点军校）本科生科研中生成式人工智能的使用》

《美国陆军军官学校（西点军校）本科生科研中生成式人工智能的使用》

专知会员服务

6+阅读 · 6月22日

美国从乌克兰无人机战争中学习经验

美国从乌克兰无人机战争中学习经验

专知会员服务

7+阅读 · 6月21日

ICML 2026 | 面向视觉语言模型的语义鲁棒性认证

ICML 2026 | 面向视觉语言模型的语义鲁棒性认证

专知会员服务

5+阅读 · 6月21日

综述 | 智能体电子设计自动化：从“交接有效性”重新理解Agentic EDA

综述 | 智能体电子设计自动化：从“交接有效性”重新理解Agentic EDA

专知会员服务

8+阅读 · 6月21日

深入解读 Palantir AIP：全球最具争议的人工智能平台究竟如何运作

深入解读 Palantir AIP：全球最具争议的人工智能平台究竟如何运作

专知会员服务

22+阅读 · 6月20日

ICML 2026 | 多任务贝叶斯上下文学习：让 Transformer 在测试时显式适应新先验

ICML 2026 | 多任务贝叶斯上下文学习：让 Transformer 在测试时显式适应新先验

专知会员服务

5+阅读 · 6月19日

ACL 2026综述 | 大规模手语数据集：资源、基准与标注标准

ACL 2026综述 | 大规模手语数据集：资源、基准与标注标准

专知会员服务

8+阅读 · 6月19日

ICML 2026 Spotlight | SmoothSMoE：解析稀疏 MoE 路由不连续

ICML 2026 Spotlight | SmoothSMoE：解析稀疏 MoE 路由不连续

专知会员服务

7+阅读 · 6月18日

综述 | 周期表视角下的大模型推理：范式、方法与失败模式

综述 | 周期表视角下的大模型推理：范式、方法与失败模式

专知会员服务

9+阅读 · 6月18日

相关VIP内容

INRIA 最新《机器学习理论》课程笔记，176页pdf

专知会员服务

52+阅读 · 2020年12月14日

【干货书】机器学习速查手册，135页pdf

【干货书】机器学习速查手册，135页pdf

专知会员服务

127+阅读 · 2020年11月20日

【清华大学】图随机神经网络，Graph Random Neural Networks

【清华大学】图随机神经网络，Graph Random Neural Networks

专知会员服务

156+阅读 · 2020年5月26日

具有组合核的图神经网络，Graph Neural Networks with Composite Kernels

具有组合核的图神经网络，Graph Neural Networks with Composite Kernels

专知会员服务

59+阅读 · 2020年5月20日

神经网络的拓扑结构，TOPOLOGY OF DEEP NEURAL NETWORKS

神经网络的拓扑结构，TOPOLOGY OF DEEP NEURAL NETWORKS

专知会员服务

35+阅读 · 2020年4月15日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

96+阅读 · 2020年3月12日

【2020新书】算法与数据结构实战，286页pdf，Algorithms Data Structures in Action

【2020新书】算法与数据结构实战，286页pdf，Algorithms Data Structures in Action

专知会员服务

107+阅读 · 2020年2月22日

深度卷积神经网络的最新架构综述，A Survey of the Recent Architectures of Deep Convolutional Neural Networks

深度卷积神经网络的最新架构综述，A Survey of the Recent Architectures of Deep Convolutional Neural Networks

专知会员服务

49+阅读 · 2020年2月15日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

《DeepGCNs: Making GCNs Go as Deep as CNNs》

《DeepGCNs: Making GCNs Go as Deep as CNNs》

专知会员服务

32+阅读 · 2019年10月17日

热门VIP内容

开通专知VIP会员享更多权益服务

综述 | 3D场景图：开放挑战与未来方向

21世纪的无人机战争

ICML 2026 | 边界嵌入塑形：用自适应对比学习破解图结构纠缠

《国防工业6.0：全自主作战系统、量子-人工智能融合与新一代战略威慑》

相关资讯

图机器学习 2.2-2.4 Properties of Networks, Random Graph

图机器学习 2.2-2.4 Properties of Networks, Random Graph

图与推荐

10+阅读 · 2020年3月28日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

无监督元学习表示学习

无监督元学习表示学习

CreateAMind

27+阅读 · 2019年1月4日

meta learning 17年：MAML SNAIL

meta learning 17年：MAML SNAIL

CreateAMind

11+阅读 · 2019年1月2日

Hierarchical Disentangled Representations

Hierarchical Disentangled Representations

CreateAMind

4+阅读 · 2018年4月15日

【论文】变分推断（Variational inference)的总结

【论文】变分推断（Variational inference)的总结

机器学习研究会

39+阅读 · 2017年11月16日

Capsule Networks解析

Capsule Networks解析

机器学习研究会

11+阅读 · 2017年11月12日

【推荐】全卷积语义分割综述

【推荐】全卷积语义分割综述

机器学习研究会

19+阅读 · 2017年8月31日

【学习】Hierarchical Softmax

【学习】Hierarchical Softmax

机器学习研究会

4+阅读 · 2017年8月6日

强化学习族谱

强化学习族谱

CreateAMind

26+阅读 · 2017年8月2日

相关论文

Kernel Thinning

Arxiv

0+阅读 · 2021年5月12日

Unbiased Monte Carlo Cluster Updates with Autoregressive Neural Networks

Arxiv

0+阅读 · 2021年5月12日

An efficient projection neural network for $\ell_1$-regularized logistic regression

Arxiv

0+阅读 · 2021年5月12日

A Bregman Learning Framework for Sparse Neural Networks

Arxiv

0+阅读 · 2021年5月10日

Generalization Guarantees for Neural Architecture Search with Train-Validation Split

Arxiv

0+阅读 · 2021年5月10日

Examining and Mitigating Kernel Saturation in Convolutional Neural Networks using Negative Images

Arxiv

0+阅读 · 2021年5月10日

Two-layer neural networks with values in a Banach space

Arxiv

0+阅读 · 2021年5月5日

Graph Neural Tangent Kernel: Fusing Graph Neural Networks with Graph Kernels

Arxiv

8+阅读 · 2019年11月4日

Adaptive Neural Trees

Adaptive Neural Trees

Arxiv

4+阅读 · 2018年12月10日

Deep Convolutional Networks as shallow Gaussian Processes

Arxiv

4+阅读 · 2018年8月16日

微信扫码咨询专知VIP会员