Despite progress in adversarial training (AT), a substantial accuracy gap remains between the best-performing and worst-performing classes on many datasets. For example, on CIFAR10, the accuracies for the best and worst classes are 74% and 23%, respectively. We argue that this gap can be reduced by explicitly optimizing for the worst-performing class, resulting in a min-max-max optimization formulation. Our method, called class focused online learning (CFOL), comes with high-probability convergence guarantees for the worst-class loss and can be easily integrated into existing training setups with minimal computational overhead. We demonstrate an improvement of the worst-class accuracy on CIFAR10 to 32%, and we observe consistent behavior across CIFAR100 and STL10. Our study highlights the need to move beyond average accuracy, particularly in safety-critical applications.
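To make the "min-max-max" structure concrete, the display below gives one possible reading of it next to standard adversarial training. The notation is assumed for illustration and does not appear in the abstract: $\theta$ denotes the model parameters, $\ell$ the loss, $\delta$ a perturbation bounded in some norm by $\varepsilon$, and $y \in \{1,\dots,k\}$ the class label. The exact formulation used by CFOL may differ.

```latex
% Standard adversarial training (min-max): average over all classes.
\min_{\theta} \;
  \mathbb{E}_{(x,y)} \Big[ \max_{\|\delta\| \le \varepsilon} \ell(\theta;\, x + \delta,\, y) \Big]

% Worst-class variant (min-max-max), under the assumed notation:
% the outer max selects the class with the largest adversarial loss.
\min_{\theta} \; \max_{y \in \{1,\dots,k\}} \;
  \mathbb{E}_{x \mid y} \Big[ \max_{\|\delta\| \le \varepsilon} \ell(\theta;\, x + \delta,\, y) \Big]
```

Under this reading, the only change from standard AT is that the average over classes is replaced by a maximum over classes, so the training objective is driven by the class that is currently hardest to defend.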