Existing work on fairness modeling commonly assumes that sensitive attributes are fully available for all instances, which may not hold in many real-world applications because of the high cost of acquiring sensitive information. When sensitive attributes are undisclosed or unavailable, a small part of the training data must be manually annotated to mitigate bias. However, because the distribution across sensitive groups is skewed, the annotated subset inherits the skewness of the original dataset, which leads to suboptimal bias mitigation. To tackle this challenge, we propose Active Penalization Of Discrimination (APOD), an interactive framework that guides the limited annotations towards maximally eliminating the effect of algorithmic bias. APOD integrates discrimination penalization with active instance selection to use the limited annotation budget efficiently, and it is theoretically proven to bound the algorithmic bias. In an evaluation on five benchmark datasets, APOD outperforms state-of-the-art baseline methods under a limited annotation budget and performs comparably to bias mitigation with full annotation, demonstrating that APOD can benefit real-world applications when sensitive information is limited.