Faster and Accurate Neural Networks with Semantic Inference

Deep neural networks (DNNs) usually come with a significant computational burden. While approaches such as structured pruning and mobile-specific DNNs have been proposed, they incur drastic accuracy loss. In this paper, we leverage the intrinsic redundancy in latent representations to reduce the computational load with limited loss in performance. We show that semantically similar inputs share many filters, especially in the earlier layers. Thus, semantically similar classes can be clustered to create cluster-specific subgraphs. To this end, we propose a new framework called Semantic Inference (SINF). In short, SINF (i) identifies the semantic cluster an object belongs to using a small additional classifier and (ii) executes, for inference, the subgraph of the base DNN that corresponds to that semantic cluster. To extract each cluster-specific subgraph, we propose a new approach named Discriminative Capability Score (DCS), which finds the subgraph able to discriminate among the members of a specific semantic cluster. DCS is independent of SINF and can be applied to any DNN. We benchmark the performance of DCS on the VGG16, VGG19, and ResNet50 DNNs trained on the CIFAR100 dataset against 6 state-of-the-art pruning approaches. Our results show that (i) SINF reduces the inference time of VGG19, VGG16, and ResNet50 by up to 35%, 29%, and 15%, respectively, with only 0.17%, 3.75%, and 6.75% accuracy loss; (ii) DCS achieves up to 3.65%, 4.25%, and 2.36% better accuracy with VGG16, VGG19, and ResNet50, respectively, than existing discriminative scores; (iii) when used as a pruning criterion, DCS achieves up to 8.13% accuracy gain with 5.82% fewer parameters than the existing state-of-the-art work published at ICLR 2023; and (iv) when considering per-cluster accuracy, SINF performs on average 5.73%, 8.38%, and 6.36% better than the base VGG16, VGG19, and ResNet50, respectively.
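The two-step inference flow described above (route to a semantic cluster, then execute only that cluster's subgraph) can be illustrated with a minimal NumPy sketch. Everything here is a toy assumption made for illustration: the single hidden layer standing in for the base DNN, the random weights, the linear router, and the hand-picked `CLUSTERS` filter subsets are all hypothetical. In particular, the filter subsets are not computed with DCS, whose scoring rule is not given in this abstract.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for a base DNN: one hidden layer of "filters" plus a class head.
# Sizes and random weights are assumptions, not values from the paper.
N_IN, N_FILTERS, N_CLASSES = 8, 16, 6
W_hidden = rng.normal(size=(N_FILTERS, N_IN))
W_head = rng.normal(size=(N_CLASSES, N_FILTERS))

# Hypothetical semantic clusters. Each cluster owns a set of classes and keeps
# a subset of the base filters. The subsets overlap (filters 6..9) to mirror
# the observation that similar inputs share filters. They are hand-picked here,
# not selected by DCS.
CLUSTERS = {
    0: {"classes": np.array([0, 1, 2]), "filters": np.arange(0, 10)},
    1: {"classes": np.array([3, 4, 5]), "filters": np.arange(6, 16)},
}

# Hypothetical small additional classifier (a linear router over the input).
W_router = rng.normal(size=(len(CLUSTERS), N_IN))


def route(x):
    """Step (i): pick the semantic cluster for input x."""
    return int(np.argmax(W_router @ x))


def run_subgraph(x, cluster_id):
    """Step (ii): run only the filters and class outputs kept for the cluster."""
    spec = CLUSTERS[cluster_id]
    filters, classes = spec["filters"], spec["classes"]
    hidden = np.maximum(W_hidden[filters] @ x, 0.0)        # ReLU on kept filters
    logits = W_head[np.ix_(classes, filters)] @ hidden     # head restricted to cluster
    return int(classes[np.argmax(logits)])


def sinf_predict(x):
    """Route, then classify within the chosen cluster's subgraph."""
    cluster_id = route(x)
    return cluster_id, run_subgraph(x, cluster_id)


def subgraph_macs(cluster_id):
    """Multiply-accumulate count of the subgraph (router cost excluded)."""
    spec = CLUSTERS[cluster_id]
    n_f, n_c = len(spec["filters"]), len(spec["classes"])
    return n_f * N_IN + n_c * n_f


BASE_MACS = N_FILTERS * N_IN + N_CLASSES * N_FILTERS
```

In this toy setting each subgraph needs fewer multiply-accumulates than the full model, which is the source of the speed-up. The prediction is also constrained to the classes of the routed cluster, so a routing mistake cannot be recovered downstream. That is one reason the router's accuracy matters in a design of this kind.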
