To solve ever more complex problems, Deep Neural Networks are scaled to billions of parameters, leading to huge computational costs. An effective approach to reduce computational requirements and increase efficiency is to prune unnecessary components of these often over-parameterized networks. Previous work has shown that attribution methods from the field of eXplainable AI serve as effective means to extract and prune the least relevant network components in a few-shot fashion. We extend the current state of the art by explicitly optimizing the hyperparameters of attribution methods for the pruning task, and by further including transformer-based networks in our analysis. Our approach yields higher compression rates for large transformer and convolutional architectures (VGG, ResNet, ViT) than previous works, while still attaining high performance on ImageNet classification tasks. Notably, our experiments indicate that transformers have a higher degree of over-parameterization than convolutional neural networks. Code is available at https://github.com/erfanhatefi/Pruning-by-eXplaining-in-PyTorch.
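To illustrate the core idea of attribution-based pruning, the following is a minimal sketch, assuming per-component relevance scores (e.g., from an attribution method such as LRP) have already been computed; the function name and score values are illustrative, not taken from the released code.

```python
import numpy as np

def prune_least_relevant(scores, compression_rate):
    """Return a boolean mask keeping the most relevant components.

    scores: hypothetical per-component attribution scores,
            higher = more relevant (assumed precomputed).
    compression_rate: fraction of components to remove.
    """
    n_prune = int(len(scores) * compression_rate)
    order = np.argsort(scores)        # ascending: least relevant first
    mask = np.ones(len(scores), dtype=bool)
    mask[order[:n_prune]] = False     # drop the n_prune lowest-scoring components
    return mask

# Example: prune half of four components by relevance.
mask = prune_least_relevant(np.array([0.10, 0.90, 0.50, 0.05]), 0.5)
# mask marks the two least relevant components (indices 3 and 0) for removal
```

In practice, the mask would be applied to structural units such as convolutional filters or attention heads, and the scores themselves depend on attribution hyperparameters, which is precisely what the paper proposes to optimize for the pruning objective.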