Representation and decomposition of functions in DAG-DNNs and structural network pruning

The conclusions provided by deep neural networks (DNNs) must be carefully scrutinized to determine whether they are universal or architecture dependent. The term DAG-DNN refers to a graphical representation of a DNN in which the architecture is expressed as a directed acyclic graph (DAG) whose arcs are associated with functions. The level of a node is the maximum number of hops between the input node and that node. In this study, we demonstrate that DAG-DNNs can be used to derive all functions defined on the various sub-architectures of a DNN. We also demonstrate that the functions defined in a DAG-DNN can be derived via a sequence of lower-triangular matrices, each of which provides the transition of functions defined on sub-graphs up to the nodes at a specified level. The lifting structure associated with these lower-triangular matrices makes it possible to perform structural pruning of a network in a systematic manner. Because the decomposition applies to all DNNs, network pruning could in theory be applied to any DNN, regardless of the underlying architecture. We demonstrate that it is possible to obtain the winning ticket (sub-network and initialization) for a weak version of the lottery ticket hypothesis, in the sense that the sub-network with its initialization achieves training performance on par with that of the original network using the same number of iterations or fewer.
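The notion of node level used above (the maximum number of hops from the input node) can be illustrated with a minimal Python sketch. This is not the authors' implementation; the function name `node_levels`, the arc-list representation, and the toy graph with a skip connection are assumptions made here for illustration only. The sketch walks the DAG in topological order and keeps the longest hop count seen for each node.

```python
from collections import defaultdict, deque


def node_levels(arcs, source):
    """Return {node: level}, where level is the maximum number of hops
    from `source` to the node. `arcs` is a list of (tail, head) pairs.
    Hypothetical helper for illustration; nodes unreachable from
    `source` are omitted from the result."""
    successors = defaultdict(list)
    in_degree = defaultdict(int)
    nodes = {source}
    for tail, head in arcs:
        successors[tail].append(head)
        in_degree[head] += 1
        nodes.update((tail, head))

    level = {source: 0}
    # Kahn's algorithm: a node is dequeued only after all of its
    # predecessors, so its level is final when it is processed.
    queue = deque(n for n in nodes if in_degree[n] == 0)
    visited = 0
    while queue:
        node = queue.popleft()
        visited += 1
        for nxt in successors[node]:
            if node in level:
                level[nxt] = max(level.get(nxt, 0), level[node] + 1)
            in_degree[nxt] -= 1
            if in_degree[nxt] == 0:
                queue.append(nxt)
    if visited != len(nodes):
        raise ValueError("graph contains a cycle, so it is not a DAG")
    return level


# Toy architecture (assumed): a chain x -> a -> b -> y with two skip arcs.
arcs = [("x", "a"), ("a", "b"), ("x", "b"), ("b", "y"), ("x", "y")]
levels = node_levels(arcs, "x")
```

In this toy graph the skip arcs do not lower any level: `y` is reachable from `x` in one hop, but its level is determined by the longest path through `a` and `b`.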
