We propose a novel taxonomy for bias evaluation of discriminative foundation models, such as Contrastive Language-Image Pretraining (CLIP), that are used for labeling tasks. We then systematically evaluate existing methods for mitigating bias in these models with respect to our taxonomy. Specifically, we evaluate OpenAI's CLIP and OpenCLIP models for key applications, such as zero-shot classification, image retrieval, and image captioning. We categorize desired behaviors along three axes: (i) whether the task concerns humans; (ii) how subjective the task is (i.e., how likely it is that people from a diverse range of backgrounds would agree on a labeling); and (iii) the intended purpose of the task, and whether fairness is better served by impartiality (i.e., making decisions independent of the protected attributes) or by representation (i.e., making decisions to maximize diversity). Finally, we provide quantitative fairness evaluations for both binary-valued and multi-valued protected attributes over ten diverse datasets. We find that fair PCA, a post-processing method for fair representations, works very well for debiasing in most of the aforementioned tasks while incurring only a minor loss of performance. However, the effectiveness of different debiasing approaches varies with the task. Hence, the debiasing approach should be chosen according to the specific use case.
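To make the notion of a post-processing method for fair representations concrete, the following is a minimal sketch of one simple fair-PCA-style projection. It is an illustration under stated assumptions, not necessarily the variant evaluated here: it assumes a single binary protected attribute, removes only the direction separating the two group means, and then applies ordinary PCA in the remaining subspace. The function name `fair_pca_sketch` and its parameters are hypothetical.

```python
import numpy as np


def fair_pca_sketch(X, z, k):
    """Illustrative fair-PCA-style projection (assumed simple variant).

    X: (n, d) array of embeddings (e.g., image or text features).
    z: (n,) binary protected attribute with values 0 and 1.
    k: target dimension, assumed to satisfy k <= d - 1.
    Returns a (d, k) projection matrix W and the (d,) mean used for centering.
    """
    X = np.asarray(X, dtype=float)
    z = np.asarray(z)
    mu = X.mean(axis=0)
    Xc = X - mu

    # Direction along which the two group means differ.
    diff = Xc[z == 1].mean(axis=0) - Xc[z == 0].mean(axis=0)
    d = X.shape[1]
    norm = np.linalg.norm(diff)
    if norm > 0:
        u = diff / norm
        P = np.eye(d) - np.outer(u, u)  # projector onto the complement of u
    else:
        P = np.eye(d)  # group means already coincide

    # Ordinary PCA restricted to the subspace orthogonal to the mean difference.
    _, _, Vt = np.linalg.svd(Xc @ P, full_matrices=False)
    W = P @ Vt[:k].T
    return W, mu
```

Applying `(X - mu) @ W` yields `k`-dimensional representations whose group-conditional means coincide, so a linear function of the projected features cannot separate the groups by their means. This equalizes first moments only; it does not by itself guarantee independence from the protected attribute.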