With increasing attention to model efficiency, post-training sparsity (PTS) has become increasingly prevalent because of its effectiveness and efficiency. However, open questions remain regarding best practices for PTS algorithms and the sparsification ability of models, which hinders further development of this area. A benchmark that comprehensively investigates these issues is therefore urgently needed. In this paper, we propose PTSBench, the first comprehensive post-training sparsity benchmark covering both algorithms and models. We benchmark more than 10 general, pluggable, fine-grained PTS techniques on 3 typical tasks using over 40 off-the-shelf model architectures. Through extensive experiments and analyses, we draw valuable conclusions and provide several insights from both the algorithm and the model perspectives. Our PTSBench provides (1) new observations for a better understanding of PTS algorithms, (2) in-depth and comprehensive evaluations of the sparsification ability of models, and (3) a well-structured and easy-to-integrate open-source framework. We hope this work will offer illuminating conclusions and advice for future studies of post-training sparsity methods and sparsification-friendly model design. The code for PTSBench is released at \href{https://github.com/ModelTC/msbench}{https://github.com/ModelTC/msbench}.