Model pruning is a popular approach to enable the deployment of large deep learning models on edge devices with restricted computational or storage capacities. Although sparse models achieve performance comparable to that of their dense counterparts at the level of the entire dataset, they exhibit high accuracy drops for some data sub-groups. Existing methods to mitigate this disparate impact induced by pruning (i) rely on surrogate metrics that address the problem indirectly and have limited interpretability; or (ii) scale poorly with the number of protected sub-groups in terms of computational cost. We propose a constrained optimization approach that directly addresses the disparate impact of pruning: our formulation bounds the accuracy change between the dense and sparse models, for each sub-group. This choice of constraints provides an interpretable success criterion to determine if a pruned model achieves acceptable disparity levels. Experimental results demonstrate that our technique scales reliably to problems involving large models and hundreds of protected sub-groups.
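The per-group success criterion described above can be illustrated with a minimal sketch. It assumes the simplest reading of the constraint: for each sub-group, the accuracy drop from the dense to the sparse model must not exceed a tolerance. The names `group_accuracy`, `disparity_report`, and `epsilon` are hypothetical and chosen for illustration only; the exact form of the bound used in the paper may differ, and this sketch only checks the criterion after the fact rather than enforcing it during training.

```python
from collections import defaultdict


def group_accuracy(preds, labels, groups):
    """Return {group: accuracy} for predictions compared against labels."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for p, y, g in zip(preds, labels, groups):
        total[g] += 1
        correct[g] += int(p == y)
    return {g: correct[g] / total[g] for g in total}


def disparity_report(dense_preds, sparse_preds, labels, groups, epsilon):
    """Compare per-group accuracy of a dense and a sparse model.

    Returns (drops, violations):
      drops      -- {group: dense accuracy minus sparse accuracy}
      violations -- the subset of drops exceeding the tolerance epsilon
                    (epsilon is an assumed, user-chosen bound)
    """
    dense_acc = group_accuracy(dense_preds, labels, groups)
    sparse_acc = group_accuracy(sparse_preds, labels, groups)
    drops = {g: dense_acc[g] - sparse_acc[g] for g in dense_acc}
    violations = {g: d for g, d in drops.items() if d > epsilon}
    return drops, violations


if __name__ == "__main__":
    # Toy data: the sparse model matches the dense one on group "a"
    # but loses accuracy on group "b".
    labels = [0, 1, 1, 0, 1, 0]
    groups = ["a", "a", "a", "b", "b", "b"]
    dense_preds = [0, 1, 1, 0, 1, 0]
    sparse_preds = [0, 1, 1, 0, 0, 1]
    drops, violations = disparity_report(
        dense_preds, sparse_preds, labels, groups, epsilon=0.1
    )
    print("per-group drops:", drops)
    print("groups violating the bound:", sorted(violations))
```

Under this reading, a pruned model would be considered acceptable exactly when `violations` is empty, which is what makes the criterion easy to interpret: each sub-group has its own explicit tolerance check.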