Bias mitigation in language models has been the topic of many studies, with a recent focus on learning separate modules such as adapters for on-demand debiasing. Beyond optimizing for a modularized debiased model, it is often critical in practice to control the degree of bias reduction at inference time, e.g., to tune for a desired performance-fairness trade-off in search results or to control the strength of debiasing in classification tasks. In this paper, we introduce Controllable Gate Adapter (ConGater), a novel modular gating mechanism with adjustable sensitivity parameters, which allows for a gradual transition from the biased state of the model to the fully debiased version at inference time. We demonstrate ConGater's performance by (1) conducting adversarial debiasing experiments with three different models on three classification tasks with four protected attributes, and (2) reducing the bias of search results through fairness list-wise regularization, which makes the trade-off between performance and fairness metrics adjustable. Our experiments on the classification tasks show that, compared to baselines of the same caliber, ConGater maintains higher task performance while retaining less information about the protected attributes. Our results on the retrieval task show that the fully debiased ConGater achieves the same fairness performance while maintaining more than twice the task performance of recent strong baselines. Overall, besides strong performance, ConGater enables continuous transitioning between the biased and debiased states of a model, enhancing personalization of use and interpretability through controllability.
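To make the idea of an adjustable sensitivity parameter concrete, the following is a minimal sketch of how a gate could interpolate between an unmodified ("biased") representation and a fully gated ("debiased") one at inference time. The abstract does not state the gate's formula, so the functional form below, the parameter name `omega`, and the helper names `gate` and `gated_output` are all illustrative assumptions rather than the authors' implementation.

```python
import math


def gate(v: float, omega: float) -> float:
    """Hypothetical gate value for a single gate logit v.

    Assumed form (not taken from the source):
        g = 1 - log2(omega + 1) / (1 + exp(v))
    omega = 0 gives g = 1 (gate fully open, representation unchanged);
    omega = 1 gives g = sigmoid(v) (gate fully active).
    Note: math.exp overflows for very large v; a real implementation
    would use a numerically stable sigmoid.
    """
    if not 0.0 <= omega <= 1.0:
        raise ValueError("omega must lie in [0, 1]")
    return 1.0 - math.log2(omega + 1.0) / (1.0 + math.exp(v))


def gated_output(hidden, gate_logits, omega):
    """Scale each hidden unit by its gate value (element-wise)."""
    return [h * gate(v, omega) for h, v in zip(hidden, gate_logits)]


if __name__ == "__main__":
    hidden = [2.0, -1.0, 0.5]        # hypothetical hidden activations
    gate_logits = [0.5, -3.0, 4.0]   # hypothetical learned gate logits
    for omega in (0.0, 0.5, 1.0):
        print(omega, gated_output(hidden, gate_logits, omega))
```

Under this assumed form, sweeping `omega` from 0 to 1 after training moves the output smoothly from the original representation toward the gated one, which is the kind of inference-time control the abstract describes.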