Data augmentation is widely applied and has proven beneficial in different machine learning tasks. However, as recently observed in some downstream tasks, data augmentation may have an unfair impact on classification: while it can improve performance on some classes, it can be detrimental to others, which can be problematic in some application domains. To counteract this phenomenon, we propose a FAir Classification approach with a Two-player game (FACT). We first formulate the training of a classifier with data augmentation as a fair optimization problem, which can be rewritten as an adversarial two-player game. Following this formulation, we propose a novel multiplicative weight optimization algorithm and prove theoretically that it can converge to a solution that is fair over classes. Interestingly, our formulation also reveals that this fairness issue over classes is not due to data augmentation alone, but is in fact a general phenomenon. Experiments on five datasets demonstrate that the performance of our learned classifiers is indeed more fairly distributed over classes, with only limited impact on average accuracy.
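To make the two-player view concrete, here is a minimal sketch of a generic multiplicative-weights (Hedge-style) update in which an adversary keeps a weight per class and shifts mass toward the classes with the highest loss, while the classifier would then minimize the resulting weighted loss. The abstract does not state FACT's exact update rule, so this is the textbook update and not the authors' algorithm. The function names, the step size `eta`, and the toy per-class losses are all assumptions made for illustration.

```python
import math


def update_class_weights(weights, class_losses, eta=0.5):
    """One multiplicative-weights step for the adversary (hypothetical helper).

    weights: current distribution over classes (non-negative, sums to 1).
    class_losses: current per-class loss of the classifier.
    eta: step size (an assumed value, not taken from the paper).
    """
    if len(weights) != len(class_losses):
        raise ValueError("weights and class_losses must have the same length")
    # Upweight poorly served classes: w_k <- w_k * exp(eta * loss_k).
    scaled = [w * math.exp(eta * loss) for w, loss in zip(weights, class_losses)]
    total = sum(scaled)
    # Renormalize so the weights remain a probability distribution.
    return [s / total for s in scaled]


def weighted_loss(weights, class_losses):
    """Objective the classifier would minimize against the adversary's weights."""
    return sum(w * loss for w, loss in zip(weights, class_losses))


# Toy example: three classes with fixed (made-up) per-class losses.
weights = [1 / 3, 1 / 3, 1 / 3]
class_losses = [0.2, 0.9, 0.4]
uniform_objective = weighted_loss(weights, class_losses)
for _ in range(5):
    weights = update_class_weights(weights, class_losses)
adversarial_objective = weighted_loss(weights, class_losses)
```

In a real training loop the per-class losses would be recomputed after each classifier update. Here they are held fixed only to show that the weights concentrate on the worst-off class, which raises the weighted objective the classifier has to reduce.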