Training models with robust group fairness properties is crucial in ethically sensitive application areas such as medical diagnosis. Despite the growing body of work aiming to minimise demographic bias in AI, this problem remains challenging. A key reason is the fairness generalisation gap: high-capacity deep learning models can fit all training data nearly perfectly, and thus also appear perfectly fair during training. In this case, bias emerges only at test time, when generalisation performance differs across subgroups. This motivates a bi-level optimisation perspective on fair learning: optimising the learning strategy based on validation fairness. Specifically, we consider the highly effective workflow of adapting pre-trained models to downstream medical imaging tasks using parameter-efficient fine-tuning (PEFT) techniques. There is a trade-off between updating more parameters, which enables a better fit to the task of interest, and updating fewer parameters, which may reduce the generalisation gap. To manage this trade-off, we propose FairTune, a framework that optimises the choice of PEFT parameters with respect to fairness. We demonstrate empirically that FairTune improves fairness on a range of medical imaging datasets.
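As a rough illustration of the bi-level view described above, the following minimal sketch treats the choice of which parameter blocks to fine-tune as an outer search scored by validation fairness, with fine-tuning as the inner step. Everything named here is an assumption for illustration and is not taken from FairTune itself: the fairness score is taken to be the gap between the best and worst subgroup accuracy, the outer search is a plain exhaustive enumeration of binary masks, and `fine_tune_and_predict` is a hypothetical callback that fine-tunes with a given mask and returns validation predictions.

```python
from itertools import product
from typing import Callable, Hashable, Sequence, Tuple


def subgroup_accuracy_gap(
    y_true: Sequence[int], y_pred: Sequence[int], groups: Sequence[Hashable]
) -> float:
    """Largest minus smallest per-subgroup accuracy (0.0 means equal accuracy).

    Assumed fairness measure for this sketch; other measures could be swapped in.
    """
    correct: dict = {}
    total: dict = {}
    for t, p, g in zip(y_true, y_pred, groups):
        total[g] = total.get(g, 0) + 1
        correct[g] = correct.get(g, 0) + int(t == p)
    accuracies = [correct[g] / total[g] for g in total]
    return max(accuracies) - min(accuracies)


def search_peft_mask(
    n_blocks: int,
    fine_tune_and_predict: Callable[[Tuple[int, ...]], Sequence[int]],
    y_val: Sequence[int],
    groups_val: Sequence[Hashable],
) -> Tuple[Tuple[int, ...], float]:
    """Outer loop: pick the mask whose fine-tuned model is fairest on validation data.

    mask[i] == 1 means block i is updated during fine-tuning, 0 means frozen.
    Exhaustive search is a stand-in that is only feasible for small n_blocks.
    """
    best_mask, best_gap = None, float("inf")
    for mask in product((0, 1), repeat=n_blocks):
        if not any(mask):
            continue  # nothing would be fine-tuned
        # Inner loop (hypothetical callback): fine-tune with this mask, predict on validation.
        y_pred = fine_tune_and_predict(mask)
        gap = subgroup_accuracy_gap(y_val, y_pred, groups_val)
        if gap < best_gap:
            best_mask, best_gap = mask, gap
    return best_mask, best_gap
```

The key design point is that the mask is scored on held-out validation data rather than training data, since a high-capacity model can look perfectly fair on the data it was fitted to. In practice the outer search would also need to account for overall accuracy, which this sketch omits.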