With the growing size of Neural Networks (NNs), model sparsification to reduce the computational cost and memory demand of model inference has become of vital interest for both research and production. While many sparsification methods have been proposed and successfully applied to individual models, to the best of our knowledge their behavior and robustness have not yet been studied on large populations of models. With this paper, we address that gap by applying two popular sparsification methods, magnitude pruning and variational dropout, to populations of models (so-called model zoos) to create sparsified versions of the original zoos. We investigate the performance of these two methods for each zoo, compare sparsification layer-wise, and analyse agreement between original and sparsified populations. We find both methods to be very robust, with magnitude pruning able to outperform variational dropout except at high sparsification ratios above 80%. Further, we find that sparsified models agree to a high degree with their original non-sparsified counterparts, and that the performance of original and sparsified models is highly correlated. Finally, all models of the model zoos and their sparsified model twins are publicly available: modelzoos.cc.