Compressing neural networks by quantizing model parameters offers a useful trade-off between performance and efficiency. Methods such as quantization-aware training and post-training quantization strive to preserve the downstream performance of compressed models relative to their full-precision counterparts. However, these techniques do not explicitly consider the impact of compression on algorithmic fairness. In this work, we study fairness-aware mixed-precision quantization schemes for medical image classification under explicit bit budgets. We introduce FairQuant, a framework that combines group-aware importance analysis, budgeted mixed-precision allocation, and a learnable Bit-Aware Quantization (BAQ) mode that jointly optimizes weights and per-unit bit allocations under bitrate and fairness regularization. We evaluate the method on Fitzpatrick17k and ISIC2019 across ResNet18/50, DeiT-Tiny, and TinyViT. Results show that FairQuant configurations with average precision near 4-6 bits recover much of the Uniform 8-bit accuracy while improving worst-group performance relative to Uniform 4- and 8-bit baselines, with comparable fairness metrics under shared budgets.
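To make the budgeted mixed-precision allocation idea concrete, here is a minimal, hypothetical sketch. It assumes a greedy scheme that starts every layer at the lowest precision and upgrades layers in order of an importance score until the average-bit budget is exhausted; the function name, the discrete bit choices, and the greedy strategy are illustrative assumptions, not the abstract's actual algorithm (FairQuant's BAQ mode learns allocations jointly with the weights).

```python
def allocate_bits(importance, budget_avg_bits, choices=(2, 4, 8)):
    """Illustrative greedy budgeted mixed-precision allocation (not the
    paper's method). Each layer starts at the lowest bit-width; the most
    important layers are upgraded first, as long as the total number of
    bits stays within budget_avg_bits * number_of_layers."""
    n = len(importance)
    total_budget = budget_avg_bits * n
    bits = [choices[0]] * n
    # Visit layers in order of decreasing importance.
    order = sorted(range(n), key=lambda i: -importance[i])
    upgraded = True
    while upgraded:
        upgraded = False
        for i in order:
            idx = choices.index(bits[i])
            if idx + 1 < len(choices):
                cost = choices[idx + 1] - bits[i]
                if sum(bits) + cost <= total_budget:
                    bits[i] = choices[idx + 1]
                    upgraded = True
    return bits

# With two layers and a 6-bit average budget, the more important
# layer ends up at 8 bits and the less important one at 4 bits.
print(allocate_bits([0.9, 0.1], 6))  # → [8, 4]
```

A learnable variant, as in the BAQ mode described above, would instead relax the discrete bit choice into a differentiable parameter per unit and penalize the expected bitrate alongside a fairness regularizer during training.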