Weak supervision enables efficient development of training sets by reducing the need for ground-truth labels. However, the techniques that make weak supervision attractive -- such as integrating any source of signal to estimate unknown labels -- also carry the danger that the resulting pseudolabels are highly biased. Surprisingly, given its everyday use and this potential for increased bias, weak supervision has not been studied from the point of view of fairness. We begin such a study, starting with the observation that even when a fair model can be built from a dataset with access to ground-truth labels, the corresponding dataset labeled via weak supervision can be arbitrarily unfair. To address this, we propose and empirically validate a model for source unfairness in weak supervision, then introduce a simple counterfactual fairness-based technique that can mitigate these biases. Theoretically, we show that our approach can simultaneously improve both accuracy and fairness -- in contrast to standard fairness approaches, which suffer from tradeoffs. Empirically, we show that our technique improves accuracy over weak supervision baselines by as much as 32% while reducing the demographic parity gap by 82.5%. A simple extension of our method aimed at maximizing performance achieves state-of-the-art results on five of the ten datasets in the WRENCH benchmark.
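The demographic parity gap reported above can be made concrete with a short sketch. It assumes the common definition for a binary sensitive attribute, the absolute difference in positive-prediction rates between the two groups; the abstract does not spell out the exact evaluation code, so the function name `demographic_parity_gap` and the toy data below are hypothetical and for illustration only.

```python
from typing import Sequence


def demographic_parity_gap(preds: Sequence[int], groups: Sequence[int]) -> float:
    """Absolute difference in positive-prediction rate between group 0 and group 1.

    Assumes binary predictions (0/1) and a binary group attribute (0/1).
    This is the common textbook definition, not necessarily the exact
    metric implementation used in the paper.
    """
    if len(preds) != len(groups):
        raise ValueError("preds and groups must have the same length")

    rates = []
    for g in (0, 1):
        # Collect the predictions for members of group g.
        members = [p for p, a in zip(preds, groups) if a == g]
        if not members:
            raise ValueError(f"group {g} has no members")
        # Fraction of group g that received a positive label.
        rates.append(sum(members) / len(members))

    return abs(rates[0] - rates[1])


# Hypothetical pseudolabels from a weak supervision pipeline:
# group 0 receives positive labels far more often than group 1.
pseudolabels = [1, 1, 1, 0, 0, 0, 0, 1]
groups = [0, 0, 0, 0, 1, 1, 1, 1]
gap = demographic_parity_gap(pseudolabels, groups)
print(f"demographic parity gap: {gap:.2f}")
```

A gap of 0 means both groups receive positive labels at the same rate, so a reduction of 82.5% means the gap shrinks to 17.5% of its baseline value.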