Deep neural networks (DNNs) have achieved remarkable success in object detection tasks, but their increasing complexity poses significant challenges for deployment on resource-constrained platforms. While model compression techniques such as pruning have emerged as essential tools, traditional magnitude-based pruning methods do not necessarily align with the true functional contribution of network components to task-specific performance. In this work, we present an explainability-inspired, layer-wise pruning framework tailored for efficient object detection. Our approach leverages a SHAP-inspired gradient--activation attribution to estimate layer importance, providing a data-driven proxy for functional contribution rather than relying solely on static weight magnitudes. We conduct comprehensive experiments across diverse object detection architectures, including ResNet-50, MobileNetV2, ShuffleNetV2, Faster R-CNN, RetinaNet, and YOLOv8, evaluating performance on the Microsoft COCO 2017 validation set. The results show that the proposed attribution-inspired pruning consistently identifies different layers as least important than L1-norm-based methods do, leading to improved accuracy--efficiency trade-offs. Notably, for ShuffleNetV2, our method yields an empirical 10\% increase in inference speed, whereas L1-pruning degrades performance by 13.7\%. For RetinaNet, the proposed approach preserves the baseline mAP (0.151) with negligible impact on inference speed, while L1-pruning incurs a 1.3\% mAP drop for a 6.2\% speed increase. These findings underscore the value of data-driven layer importance assessment and demonstrate that explainability-inspired compression offers a principled direction for deploying deep neural networks on edge and resource-constrained platforms while preserving both performance and interpretability.
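The gradient--activation attribution underlying the framework can be sketched in miniature as follows. This is a minimal NumPy illustration on a toy two-layer network, not the paper's implementation: the network, its dimensions, the squared-error loss, and the exact score (mean absolute activation-times-gradient per layer) are all illustrative assumptions standing in for the SHAP-inspired formulation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy two-layer network; weights and sizes are illustrative only.
W1 = rng.normal(size=(16, 8))   # hidden layer
W2 = rng.normal(size=(4, 16))   # output layer

def layer_importance(x, target):
    """Score each layer by mean |activation * gradient| -- a simple
    gradient-activation attribution used here as a stand-in for the
    paper's SHAP-inspired layer-importance estimate."""
    # Forward pass, caching per-layer activations.
    h1 = np.maximum(0.0, W1 @ x)   # hidden activation (ReLU)
    y = W2 @ h1                    # output activation
    # Backward pass for a squared-error loss 0.5 * ||y - target||^2.
    g_y = y - target               # dL/dy
    g_h1 = W2.T @ g_y              # dL/dh1 (grad w.r.t. post-ReLU activation)
    return {
        "hidden": float(np.mean(np.abs(h1 * g_h1))),
        "output": float(np.mean(np.abs(y * g_y))),
    }

x = rng.normal(size=8)
t = rng.normal(size=4)
scores = layer_importance(x, t)
# The lowest-scoring layer is the candidate for pruning.
least_important = min(scores, key=scores.get)
```

In practice such scores would be averaged over a calibration set (e.g. COCO validation images) before ranking layers; the single-sample version above only shows the mechanics.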