The state of neural network pruning has long been recognized as unclear, even confusing, largely due to "a lack of standardized benchmarks and metrics" [3]. To standardize benchmarks, we first need to answer a question: what kind of comparison setup counts as fair? Unfortunately, this basic yet crucial question has rarely been clarified in the community. Meanwhile, we observe that several papers have used (severely) sub-optimal hyper-parameters in their pruning experiments, for reasons that remain elusive. These sub-optimal hyper-parameters further distort the benchmarks, rendering the state of neural network pruning even more obscure. Two mysteries in pruning exemplify this confusion: the performance-boosting effect of a larger finetuning learning rate, and the argument that inheriting pretrained weights has no value in filter pruning. In this work, we attempt to explain the confusing state of network pruning by demystifying these two mysteries. Specifically, (1) we first clarify the fairness principle in pruning experiments and summarize the widely used comparison setups; (2) we then unveil the two pruning mysteries and point out the central role of network trainability, which has not been well recognized so far; (3) finally, we conclude the paper and give concrete suggestions on how to calibrate pruning benchmarks in the future. Code: https://github.com/mingsun-tse/why-the-state-of-pruning-so-confusing.