This paper presents a Spatial Re-parameterization (SpRe) method for N:M sparsity in CNNs. SpRe stems from an observation about the restricted variety of spatial sparsity in N:M sparsity compared with unstructured sparsity. In particular, N:M sparsity exhibits a fixed sparsity rate across spatial positions, because its pattern mandates N non-zero components among every M consecutive weights along the input-channel dimension of convolution filters. In contrast, we observe that unstructured sparsity displays substantial divergence in sparsity across spatial positions, which we experimentally verify to be crucial to its robust performance retention relative to N:M sparsity. SpRe therefore uses the spatial-sparsity distribution of unstructured sparsity to attach an extra branch alongside the original N:M branch at training time, allowing the N:M sparse network to maintain a spatial-sparsity distribution similar to that of unstructured sparsity. At inference time, the extra branch is re-parameterized into the main N:M branch without distorting the sparse pattern or incurring additional computation. SpRe enables N:M sparsity methods to match the performance of state-of-the-art unstructured sparsity methods across various benchmarks. Code and models are anonymously available at \url{https://github.com/zyxxmu/SpRe}.
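To make the fixed-spatial-sparsity observation concrete, the sketch below builds an N:M mask in NumPy. The function name `nm_mask` and the magnitude-based selection rule are illustrative assumptions, not the paper's exact procedure; the point is that because every group of M consecutive input-channel weights keeps exactly N entries, the sparsity rate at each kernel position is identically N/M.

```python
import numpy as np

def nm_mask(weight, n=2, m=4):
    """Keep the n largest-magnitude weights in every group of m
    consecutive weights along the input-channel axis (axis 1)."""
    out_c, in_c, kh, kw = weight.shape
    assert in_c % m == 0, "input channels must be divisible by m"
    # Group along input channels: (out_c, in_c // m, m, kh, kw).
    w = np.abs(weight).reshape(out_c, in_c // m, m, kh, kw)
    # The (m - n) smallest-magnitude entries in each group are pruned.
    order = np.argsort(w, axis=2)
    mask = np.ones_like(w, dtype=bool)
    np.put_along_axis(mask, order[:, :, : m - n], False, axis=2)
    return mask.reshape(weight.shape)

rng = np.random.default_rng(0)
w = rng.standard_normal((8, 16, 3, 3))   # (out_c, in_c, kh, kw)
mask = nm_mask(w, n=2, m=4)

# Fraction of weights kept at each spatial position (kh, kw):
per_position = mask.mean(axis=(0, 1))
print(per_position)  # every entry is exactly n/m = 0.5 for 2:4 sparsity
```

Unstructured pruning, by contrast, would let `per_position` vary freely across the kernel, which is the spatial diversity SpRe's extra training-time branch is designed to recover.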