Model pruning is a popular approach to enable the deployment of large deep learning models on edge devices with restricted computational or storage capacities. Although sparse models achieve performance comparable to that of their dense counterparts at the level of the entire dataset, they exhibit large accuracy drops for some data sub-groups. Existing methods to mitigate this disparate impact induced by pruning either (i) rely on surrogate metrics that address the problem indirectly and have limited interpretability, or (ii) scale poorly in computational cost with the number of protected sub-groups. We propose a constrained optimization approach that $\textit{directly addresses the disparate impact of pruning}$: our formulation bounds the accuracy change between the dense and sparse models for each sub-group. This choice of constraints provides an interpretable success criterion for determining whether a pruned model achieves acceptable disparity levels. Experimental results demonstrate that our technique scales reliably to problems involving large models and hundreds of protected sub-groups.
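The per-sub-group success criterion described above can be illustrated with a minimal sketch. It only checks the constraint after the fact and does not perform the constrained optimization itself. The function names, the tolerance `epsilon`, and the toy data are assumptions made for illustration, and the exact form of the bound used in the paper may differ from the plain accuracy drop computed here.

```python
from collections import defaultdict


def group_accuracy(preds, labels, groups):
    """Return {group: accuracy} for predictions split by sub-group."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for p, y, g in zip(preds, labels, groups):
        total[g] += 1
        correct[g] += int(p == y)
    return {g: correct[g] / total[g] for g in total}


def accuracy_drops(dense_preds, sparse_preds, labels, groups):
    """Per-group accuracy change: dense accuracy minus sparse accuracy."""
    dense_acc = group_accuracy(dense_preds, labels, groups)
    sparse_acc = group_accuracy(sparse_preds, labels, groups)
    return {g: dense_acc[g] - sparse_acc[g] for g in dense_acc}


def violated_groups(drops, epsilon):
    """Groups whose accuracy drop exceeds the tolerance epsilon (hypothetical name)."""
    return sorted(g for g, d in drops.items() if d > epsilon)


# Toy data (assumed): two sub-groups "a" and "b" with four samples each.
labels       = [0, 1, 1, 0, 1, 0, 1, 1]
groups       = ["a", "a", "a", "a", "b", "b", "b", "b"]
dense_preds  = [0, 1, 1, 0, 1, 0, 1, 1]  # dense model: all correct
sparse_preds = [0, 1, 1, 0, 1, 1, 0, 1]  # sparse model: two errors, both in "b"

drops = accuracy_drops(dense_preds, sparse_preds, labels, groups)
violations = violated_groups(drops, epsilon=0.25)
print(drops)       # {'a': 0.0, 'b': 0.5}
print(violations)  # ['b']
```

In this toy case the aggregate accuracy drop is 0.25, which meets the tolerance, while sub-group "b" loses 0.5. A dataset-level metric would therefore hide exactly the kind of disparity that a per-group bound exposes.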