Existing work on fairness modeling commonly assumes that sensitive attributes are fully available for all instances, which may not hold in many real-world applications because sensitive information is costly to acquire. When sensitive attributes are not disclosed or available, a small part of the training data must be manually annotated to mitigate bias. However, because the distribution across sensitive groups is skewed, the annotated subset preserves the skewness of the original dataset, which leads to suboptimal bias mitigation. To tackle this challenge, we propose Active Penalization Of Discrimination (APOD), an interactive framework that guides the limited annotations towards maximally eliminating the effect of algorithmic bias. APOD integrates discrimination penalization with active instance selection to use the limited annotation budget efficiently, and it is theoretically proven to be capable of bounding the algorithmic bias. In an evaluation on five benchmark datasets, APOD outperforms state-of-the-art baseline methods under a limited annotation budget and shows performance comparable to fully annotated bias mitigation, which demonstrates that APOD could benefit real-world applications when sensitive information is limited.