Training models with robust group fairness properties is crucial in ethically sensitive application areas such as medical diagnosis. Despite the growing body of work aiming to minimise demographic bias in AI, this problem remains challenging. A key reason is the fairness generalisation gap: high-capacity deep learning models can fit all training data nearly perfectly, and thus also exhibit perfect fairness during training. In this case, bias emerges only during testing, when generalisation performance differs across subgroups. This motivates us to take a bi-level optimisation perspective on fair learning: optimising the learning strategy based on validation fairness. Specifically, we consider the highly effective workflow of adapting pre-trained models to downstream medical imaging tasks using parameter-efficient fine-tuning (PEFT) techniques. There is a trade-off between updating more parameters, which enables a better fit to the task of interest, and updating fewer parameters, which potentially reduces the generalisation gap. To manage this trade-off, we propose FairTune, a framework to optimise the choice of PEFT parameters with respect to fairness. We demonstrate empirically that FairTune leads to improved fairness on a range of medical imaging datasets. The code is available at https://github.com/Raman1121/FairTune
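The bi-level view described above can be illustrated with a minimal sketch. It is not the FairTune implementation: the fairness measure (the largest difference in subgroup accuracy), the exhaustive loop over candidate masks, and the names `fairness_gap`, `select_mask`, and `fine_tune_and_predict` are all assumptions made for illustration. The inner loop (fine-tuning with a given choice of updated parameters) is left as a caller-supplied function; the outer loop scores each choice by fairness on validation data rather than on training data.

```python
from collections import defaultdict
from typing import Callable, Dict, Hashable, Iterable, Sequence, Tuple

# A hypothetical PEFT mask: one flag per parameter group, True = fine-tune it.
Mask = Tuple[bool, ...]


def subgroup_accuracies(
    y_true: Sequence[int], y_pred: Sequence[int], groups: Sequence[Hashable]
) -> Dict[Hashable, float]:
    """Accuracy computed separately for each demographic subgroup."""
    correct: Dict[Hashable, int] = defaultdict(int)
    total: Dict[Hashable, int] = defaultdict(int)
    for t, p, g in zip(y_true, y_pred, groups):
        total[g] += 1
        correct[g] += int(t == p)
    return {g: correct[g] / total[g] for g in total}


def fairness_gap(
    y_true: Sequence[int], y_pred: Sequence[int], groups: Sequence[Hashable]
) -> float:
    """Assumed fairness measure: best minus worst subgroup accuracy."""
    accs = subgroup_accuracies(y_true, y_pred, groups)
    return max(accs.values()) - min(accs.values())


def select_mask(
    candidate_masks: Iterable[Mask],
    fine_tune_and_predict: Callable[[Mask], Sequence[int]],
    y_val: Sequence[int],
    groups_val: Sequence[Hashable],
) -> Tuple[Mask, float]:
    """Outer loop: keep the mask with the smallest validation fairness gap.

    `fine_tune_and_predict` stands in for the inner loop: it should fine-tune
    the model using only the parameters selected by the mask and return the
    model's predictions on the validation set.
    """
    best_mask, best_gap = None, float("inf")
    for mask in candidate_masks:
        y_pred = fine_tune_and_predict(mask)          # inner problem
        gap = fairness_gap(y_val, y_pred, groups_val)  # outer objective
        if gap < best_gap:
            best_mask, best_gap = mask, gap
    if best_mask is None:
        raise ValueError("candidate_masks must not be empty")
    return best_mask, best_gap
```

In practice the set of candidate masks is far too large to enumerate, so the loop would be replaced by a search or hyperparameter optimisation procedure, and the objective would typically combine fairness with overall task performance.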