Weak supervision enables efficient development of training sets by reducing the need for ground-truth labels. However, the techniques that make weak supervision attractive, such as integrating any source of signal to estimate unknown labels, also carry the risk that the resulting pseudolabels are highly biased. Surprisingly, given its everyday use and the potential for increased bias, weak supervision has not been studied from the point of view of fairness. We begin such a study, starting with the observation that even when a fair model can be built from a dataset with access to ground-truth labels, the corresponding dataset labeled via weak supervision can be arbitrarily unfair. To address this, we propose and empirically validate a model for source unfairness in weak supervision, then introduce a simple counterfactual fairness-based technique that can mitigate these biases. Theoretically, we show that it is possible for our approach to simultaneously improve both accuracy and fairness, in contrast to standard fairness approaches that suffer from tradeoffs. Empirically, we show that our technique improves accuracy over weak supervision baselines by as much as 32% while reducing the demographic parity gap by 82.5%. A simple extension of our method aimed at maximizing performance achieves state-of-the-art performance on five of the ten datasets in the WRENCH benchmark.
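To make the fairness metric named above concrete, the following minimal sketch computes a demographic parity gap on pseudolabels. It is an illustration under stated assumptions, not the method proposed in this work: the majority vote stands in for a weak supervision label model, and the vote matrix, the group assignment, and the function names `majority_vote` and `demographic_parity_gap` are all hypothetical.

```python
import numpy as np


def majority_vote(votes):
    """Aggregate labeling-function votes into binary pseudolabels.

    votes: array of shape (n_examples, n_sources) with entries
    +1 (positive), -1 (negative), or 0 (abstain).
    Majority vote is an assumed stand-in for a label model.
    """
    totals = np.asarray(votes).sum(axis=1)
    # Ties and all-abstain rows fall back to the negative class (0).
    return (totals > 0).astype(int)


def demographic_parity_gap(labels, group):
    """Absolute difference in positive-label rate between two groups."""
    labels = np.asarray(labels)
    group = np.asarray(group)
    rate_group0 = labels[group == 0].mean()
    rate_group1 = labels[group == 1].mean()
    return abs(rate_group0 - rate_group1)


# Hypothetical toy data: 8 examples, 3 weak sources, binary group attribute.
votes = np.array([
    [ 1,  1,  0],
    [ 1, -1,  1],
    [ 1,  1,  1],
    [-1,  1,  1],
    [-1, -1,  0],
    [ 1, -1, -1],
    [-1,  0, -1],
    [ 1,  1, -1],
])
group = np.array([0, 0, 0, 0, 1, 1, 1, 1])

pseudolabels = majority_vote(votes)
gap = demographic_parity_gap(pseudolabels, group)
print(pseudolabels, gap)
```

In this toy setting every example in group 0 receives a positive pseudolabel while only one in four does in group 1, so the gap is large even though no ground-truth labels were consulted. This is the kind of source-induced disparity the abstract describes.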