The redundancy of convolutional neural networks depends not only on weights but also on inputs. Shuffling is an efficient operation for mixing channel information, but the shuffle order is usually predefined. To reduce data-dependent redundancy, we devise a dynamic shuffle module that generates data-dependent permutation matrices for shuffling. Because the size of a permutation matrix is proportional to the square of the number of input channels, we make the generation process efficient by dividing the channels into groups, generating two small shared permutation matrices for each group, and using the Kronecker product and cross-group shuffle to obtain the final permutation matrices. To make the generation process learnable, we employ softmax, orthogonal regularization, and binarization, motivated by theoretical analysis, to asymptotically approximate the permutation matrix. Dynamic shuffle adaptively mixes channel information with negligible extra computation and memory overhead. Experimental results on the image classification benchmarks CIFAR-10, CIFAR-100, Tiny ImageNet, and ImageNet show that our method significantly improves the performance of ShuffleNets. By adding the dynamically generated matrix to a learnable static matrix, we further propose static-dynamic-shuffle and show that it can serve as a lightweight replacement for ordinary pointwise convolution.
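The Kronecker-product construction and the softmax, orthogonal regularization, and binarization steps named in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the helper names (`kron`, `softmax_rows`, `orthogonal_penalty`, `binarize_rows`, `shuffle`), the row-wise argmax used for binarization, the squared Frobenius form of the penalty, and the hand-picked logits are all assumptions made for illustration, and grouping and cross-group shuffle are omitted.

```python
import math


def kron(a, b):
    """Kronecker product of two matrices given as lists of lists."""
    return [
        [a_ij * b_kl for a_ij in a_row for b_kl in b_row]
        for a_row in a
        for b_row in b
    ]


def softmax_rows(m):
    """Row-wise softmax: every row becomes a probability vector."""
    out = []
    for row in m:
        peak = max(row)  # subtract the max for numerical stability
        exps = [math.exp(x - peak) for x in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out


def orthogonal_penalty(m):
    """Squared Frobenius norm of M M^T - I (assumed form of the regularizer)."""
    n = len(m)
    penalty = 0.0
    for i in range(n):
        for j in range(n):
            dot = sum(m[i][k] * m[j][k] for k in range(n))
            penalty += (dot - (1.0 if i == j else 0.0)) ** 2
    return penalty


def binarize_rows(m):
    """Row-wise argmax (an assumption). This yields a valid permutation only
    if every row selects a different column."""
    out = []
    for row in m:
        hot = row.index(max(row))
        out.append([1 if k == hot else 0 for k in range(len(row))])
    return out


def is_permutation(m):
    """True if every row and every column holds exactly one 1 and zeros elsewhere."""
    n = len(m)
    one_hot = [0] * (n - 1) + [1]
    rows_ok = all(len(r) == n and sorted(r) == one_hot for r in m)
    cols_ok = all(sorted(c) == one_hot for c in zip(*m))
    return rows_ok and cols_ok


def shuffle(perm, channels):
    """Apply a permutation matrix to a list of per-channel values."""
    return [sum(p * c for p, c in zip(row, channels)) for row in perm]


# Hypothetical logits; in the paper these would be generated from the input.
logits_a = [[0.0, 4.0], [4.0, 0.0]]
logits_b = [[0.0, 5.0, 0.0], [0.0, 0.0, 5.0], [5.0, 0.0, 0.0]]

soft_a = softmax_rows(logits_a)
soft_b = softmax_rows(logits_b)
perm_a = binarize_rows(soft_a)  # 2 x 2
perm_b = binarize_rows(soft_b)  # 3 x 3

# Two small factors (4 + 9 = 13 entries) stand in for a 6 x 6 matrix (36 entries).
big = kron(perm_a, perm_b)
shuffled = shuffle(big, [10, 11, 12, 13, 14, 15])
print(shuffled)  # [14, 15, 13, 11, 12, 10]
```

The penalty is zero for an exact permutation matrix and positive for the soft, pre-binarization matrices, which is why driving it toward zero pushes a row-stochastic matrix toward a permutation. The Kronecker product of two permutation matrices is itself a permutation matrix, so only the small factors need to be generated.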