Conclusions drawn from deep neural networks (DNNs) must be carefully scrutinized to determine whether they are universal or architecture dependent. The term DAG-DNN refers to a graphical representation of a DNN in which the architecture is expressed as a directed acyclic graph (DAG) whose arcs are associated with functions. The level of a node is the maximum number of hops from the input node to that node. In this study, we demonstrate that DAG-DNNs can be used to derive all functions defined on the various sub-architectures of a DNN. We also demonstrate that the functions defined in a DAG-DNN can be derived via a sequence of lower-triangular matrices, each of which provides the transition of the functions defined on sub-graphs up to the nodes at a specified level. The lifting structure associated with the lower-triangular matrices makes it possible to perform structural pruning of a network in a systematic manner. Because this decomposition is applicable to all DNNs, network pruning could in theory be applied to any DNN, regardless of the underlying architecture. We demonstrate that it is possible to obtain the winning ticket (sub-network and initialization) for a weak version of the lottery ticket hypothesis, in the sense that the sub-network with its initialization achieves training performance on par with that of the original network using the same number of iterations or fewer.
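To make the notion of node level concrete, the following is a minimal sketch rather than the construction used in this work. It assumes a DAG given as a list of (tail, head) arcs with a single input node from which every node is reachable; the function names `node_levels` and `arc_matrix` and the toy graph are hypothetical. It computes each node's level as the maximum number of hops from the input node, then shows that ordering the nodes by level yields a strictly lower-triangular arc-incidence matrix, which is only loosely analogous to the lower-triangular matrices of functions described above.

```python
from collections import defaultdict, deque


def node_levels(arcs, source):
    """Return {node: level}; level is the maximum number of hops from source."""
    successors = defaultdict(list)
    indegree = defaultdict(int)
    nodes = {source}
    for tail, head in arcs:
        successors[tail].append(head)
        indegree[head] += 1
        nodes.update((tail, head))

    level = {source: 0}
    # Kahn's algorithm: visit nodes in topological order so that every
    # predecessor's level is final before it is propagated forward.
    queue = deque(n for n in nodes if indegree[n] == 0)
    visited = 0
    while queue:
        tail = queue.popleft()
        visited += 1
        for head in successors[tail]:
            if tail in level:
                level[head] = max(level.get(head, 0), level[tail] + 1)
            indegree[head] -= 1
            if indegree[head] == 0:
                queue.append(head)
    if visited != len(nodes):
        raise ValueError("graph contains a cycle, so it is not a DAG")
    return level


def arc_matrix(arcs, level):
    """Order nodes by level and return (order, A), with A[i][j] = 1 when an
    arc runs from order[j] to order[i]."""
    order = sorted(level, key=lambda n: (level[n], n))
    index = {n: i for i, n in enumerate(order)}
    size = len(order)
    A = [[0] * size for _ in range(size)]
    for tail, head in arcs:
        A[index[head]][index[tail]] = 1
    return order, A


# Hypothetical toy architecture with two skip connections.
arcs = [("in", "a"), ("a", "b"), ("in", "b"), ("b", "out"), ("a", "out")]
levels = node_levels(arcs, "in")
order, A = arc_matrix(arcs, levels)
print(levels)
for row in A:
    print(row)
```

Every arc leads from a lower level to a strictly higher one, so in level order all nonzero entries of `A` fall below the diagonal. Here the entries are 0/1 indicators; in a DAG-DNN the arcs carry functions instead.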