Multigrade Neural Network Approximation

We study multigrade deep learning (MGDL) as a principled framework for structured error refinement in deep neural networks. While the approximation power of neural networks is now relatively well understood, training very deep architectures remains challenging due to highly non-convex and often ill-conditioned optimization landscapes. In contrast, for relatively shallow networks, most notably one-hidden-layer $\texttt{ReLU}$ models, training admits convex reformulations with global guarantees, motivating learning paradigms that improve stability while scaling to depth. MGDL builds upon this insight by training deep networks grade by grade: previously learned grades are frozen, and each new residual block is trained solely to reduce the remaining approximation error, yielding an interpretable and stable hierarchical refinement process. We develop an operator-theoretic foundation for MGDL and prove that, for any continuous target function, there exists a fixed-width multigrade $\texttt{ReLU}$ scheme whose residuals decrease strictly across grades and converge uniformly to zero. To the best of our knowledge, this work provides the first rigorous theoretical guarantee that grade-wise training yields provable vanishing approximation error in deep networks. Numerical experiments further illustrate the theoretical results.
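The grade-by-grade idea can be illustrated with a small toy sketch. It is not the training procedure of the paper. As a loudly stated simplification, the hidden weights of each grade are drawn at random and then frozen, and only a linear output layer is fit to the current residual by least squares. The function name `fit_multigrade`, the parameters `n_grades` and `width`, and the target $\sin(2\pi x)$ are all hypothetical choices made for illustration.

```python
import numpy as np


def relu(z):
    return np.maximum(z, 0.0)


def fit_multigrade(x, y, n_grades=4, width=32, seed=0):
    """Toy grade-wise residual fitting; returns the RMS residual after each grade.

    Assumption (not from the paper): hidden weights are random and frozen,
    and only the output layer of each grade is fit, via least squares.
    """
    rng = np.random.default_rng(seed)
    features = x.reshape(-1, 1)        # grade-0 features are the raw inputs
    residual = y.astype(float).copy()  # error left for the next grade to reduce
    errors = [float(np.sqrt(np.mean(residual ** 2)))]

    for _ in range(n_grades):
        fan_in = features.shape[1]
        # New fixed-width ReLU block stacked on the frozen earlier features.
        W = rng.normal(0.0, np.sqrt(2.0 / fan_in), size=(fan_in, width))
        b = rng.normal(0.0, 1.0, size=width)
        features = relu(features @ W + b)

        # Fit this grade's output layer to the remaining residual only.
        design = np.hstack([features, np.ones((len(x), 1))])
        coef, *_ = np.linalg.lstsq(design, residual, rcond=None)

        # Earlier grades stay untouched; the new grade adds a correction.
        residual = residual - design @ coef
        errors.append(float(np.sqrt(np.mean(residual ** 2))))

    return errors


x = np.linspace(-1.0, 1.0, 200)
y = np.sin(2.0 * np.pi * x)
errors = fit_multigrade(x, y)
for grade, err in enumerate(errors):
    print(f"after grade {grade}: RMS residual = {err:.3e}")
```

Because each grade solves a least-squares problem on the current residual, the RMS error on the sample points cannot increase from one grade to the next in this sketch. That is a much weaker statement than the strict decrease and uniform convergence proved in the paper.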

