We present ELSA, a practical solution for creating deep networks that can easily be deployed at different levels of sparsity. The core idea is to embed one or more sparse networks within a single dense network as a proper subset of the weights. At prediction time, any of the sparse models can be extracted simply by zeroing out weights according to a predefined mask. ELSA is simple, powerful and highly flexible. It can use essentially any existing technique for network sparsification and network training. In particular, it does not restrict the loss function, the architecture or the optimization technique. Our experiments show that ELSA's advantage of flexible deployment comes with no reduction, or only a negligible one, in prediction quality compared to the standard approach of training and storing multiple sparse networks independently.
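The extraction step described above can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation: the helper names `magnitude_mask` and `extract_sparse`, the sparsity levels 0.5 and 0.9, and the use of magnitude pruning to build the masks are all assumptions made for illustration, since the abstract states only that a predefined mask is applied and that any sparsification technique may supply it. The sketch covers extraction only, not the training that makes the embedded sparse networks accurate.

```python
import random


def magnitude_mask(weights, sparsity):
    """Return a 0/1 mask that keeps the largest-magnitude weights.

    Magnitude pruning is an illustrative assumption; any sparsification
    technique could supply the mask instead.
    """
    n_keep = len(weights) - int(round(sparsity * len(weights)))
    # Indices ordered from largest to smallest absolute value.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]), reverse=True)
    keep = set(order[:n_keep])
    return [1 if i in keep else 0 for i in range(len(weights))]


def extract_sparse(weights, mask):
    """Zero out every weight whose mask entry is 0; keep the rest unchanged."""
    return [w if m else 0.0 for w, m in zip(weights, mask)]


random.seed(0)
# Stand-in for the flattened weights of a single dense network.
dense = [random.gauss(0.0, 1.0) for _ in range(20)]

# One predefined mask per deployment sparsity level (hypothetical levels).
masks = {s: magnitude_mask(dense, s) for s in (0.5, 0.9)}

# Only the dense weights and the masks are stored; each sparse model is
# obtained on demand by masking.
sparse_50 = extract_sparse(dense, masks[0.5])
sparse_90 = extract_sparse(dense, masks[0.9])
```

In this sketch a single weight list plus one mask per sparsity level stands in for what would otherwise be several independently stored sparse networks. Because both masks are derived from the same magnitudes, the 90% sparse model's weights are also a subset of the 50% sparse model's weights, although the abstract does not say whether ELSA requires such nesting.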