Understanding Sparse Neural Networks from their Topology via Multipartite Graph Representations

Pruning-at-Initialization (PaI) algorithms provide Sparse Neural Networks (SNNs) which are computationally more efficient than their dense counterparts, and try to avoid performance degradation. While much emphasis has been directed towards \emph{how} to prune, we still do not know \emph{what topological metrics} of the SNNs characterize \emph{good performance}. Prior work offers layer-wise topological metrics for predicting SNN performance: the Ramanujan-based metrics. To exploit these metrics, proper ways to represent network layers via Graph Encodings (GEs) are needed, with Bipartite Graph Encodings (BGEs) being the current \emph{de-facto} standard. Nevertheless, existing BGEs neglect the impact of the inputs, and do not characterize the SNN in an end-to-end manner. Additionally, through a thorough study of the Ramanujan-based metrics, we discover that, when paired with BGEs, they are only as good as the \emph{layer-wise density} as performance predictors. To close both gaps, we design a comprehensive topological analysis for SNNs with both linear and convolutional layers, via (i) a new input-aware Multipartite Graph Encoding (MGE) for SNNs and (ii) the design of new end-to-end topological metrics over the MGE. With these novelties, we show the following: (a) the proposed MGE makes it possible to extract topological metrics that are much better predictors of the accuracy drop than metrics computed from current input-agnostic BGEs; (b) which metrics are important depends on the sparsity level and on the architecture; (c) a mixture of our topological metrics can rank PaI algorithms more effectively than Ramanujan-based metrics. The codebase is publicly available at https://github.com/eliacunegatti/mge-snn.
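
To make the terms in the abstract concrete, the following is a minimal sketch of an input-agnostic bipartite encoding of a single sparse linear layer, together with its layer-wise density. It is an illustration only and is not taken from the authors' codebase: the function names `bipartite_adjacency` and `layer_density`, the example mask, and the convention that the mask has shape `(n_out, n_in)` are all assumptions made here. The multipartite, input-aware encoding proposed in the paper is not reproduced.

```python
import numpy as np


def bipartite_adjacency(mask):
    """Adjacency matrix of the bipartite graph of one sparse linear layer.

    Assumption: mask has shape (n_out, n_in) and a nonzero entry at (i, j)
    means the weight from input j to output i survived pruning.
    Nodes 0..n_in-1 are inputs, nodes n_in..n_in+n_out-1 are outputs.
    """
    mask = (np.asarray(mask) != 0).astype(int)
    n_out, n_in = mask.shape
    adj = np.zeros((n_in + n_out, n_in + n_out), dtype=int)
    adj[:n_in, n_in:] = mask.T  # edges from input nodes to output nodes
    adj[n_in:, :n_in] = mask    # mirrored block, so the graph is undirected
    return adj


def layer_density(mask):
    """Fraction of weights in the layer that were kept after pruning."""
    mask = np.asarray(mask) != 0
    return mask.sum() / mask.size


# Hypothetical pruning mask for a layer with 4 inputs and 3 outputs.
mask = np.array([[1, 0, 1, 0],
                 [0, 1, 0, 0],
                 [1, 1, 0, 1]])

adj = bipartite_adjacency(mask)
density = layer_density(mask)
in_degrees = adj[:4].sum(axis=1)   # kept edges per input node
out_degrees = adj[4:].sum(axis=1)  # kept edges per output node
```

Here the graph depends only on the mask, which is why such an encoding is called input-agnostic: nothing about the size or structure of the data fed to the layer enters the adjacency matrix.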

