How does promoting the minority fraction affect generalization? A theoretical study of the one-hidden-layer neural network on group imbalance

Group imbalance is a known problem in empirical risk minimization (ERM), where high average accuracy is accompanied by low accuracy in a minority group. Despite algorithmic efforts to improve minority group accuracy, a theoretical generalization analysis of ERM on individual groups remains elusive. By formulating the group imbalance problem with the Gaussian Mixture Model, this paper quantifies the impact of individual groups on the sample complexity, the convergence rate, and the average and group-level testing performance. Although our theoretical framework is centered on binary classification using a one-hidden-layer neural network, to the best of our knowledge, we provide the first theoretical analysis of the group-level generalization of ERM in addition to the commonly studied average generalization performance. Sample insights from our theoretical results include that when all group-level covariances are in the medium regime and all group means are close to zero, the learning performance is most desirable in the sense of a small sample complexity, a fast training rate, and high average and group-level testing accuracy. Moreover, we show that increasing the fraction of the minority group in the training data does not necessarily improve the generalization performance of the minority group. Our theoretical results are validated on both synthetic and empirical datasets, such as CelebA and CIFAR-10 in image classification.
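The setting described in the abstract can be made concrete with a small simulation. The sketch below is not the paper's code and makes several assumptions that the abstract does not state: two groups with isotropic covariance, labels produced by a hypothetical ground-truth ("teacher") one-hidden-layer sigmoid network, and plain gradient descent on the cross-entropy loss. All names (`sample_mixture`, `train_erm`, `run_demo`) and all numeric settings are illustrative.

```python
import numpy as np

def sample_mixture(n, fractions, means, stds, rng):
    """Draw n inputs from a Gaussian mixture; return inputs and group ids."""
    groups = rng.choice(len(fractions), size=n, p=fractions)
    means = np.asarray(means, dtype=float)   # shape (num_groups, d)
    stds = np.asarray(stds, dtype=float)     # shape (num_groups,), isotropic (assumption)
    noise = rng.standard_normal((n, means.shape[1]))
    return means[groups] + stds[groups, None] * noise, groups

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(W, x):
    """One-hidden-layer network: average of K sigmoid hidden units."""
    return sigmoid(x @ W.T).mean(axis=1)

def train_erm(x, y, K, steps, lr, rng):
    """Minimize the empirical cross-entropy risk by full-batch gradient descent."""
    W = 0.1 * rng.standard_normal((K, x.shape[1]))
    for _ in range(steps):
        h = sigmoid(x @ W.T)                          # hidden activations, (n, K)
        p = np.clip(h.mean(axis=1), 1e-6, 1 - 1e-6)   # predicted probability
        dp = (p - y) / (p * (1 - p))                  # d(loss)/d(p) per sample
        grad = ((dp[:, None] * h * (1 - h)).T @ x) / (K * len(y))
        W -= lr * grad
    return W

def group_accuracy(W, x, y, groups, num_groups):
    """Return average accuracy and a list of per-group accuracies."""
    correct = (forward(W, x) > 0.5).astype(float) == y
    return correct.mean(), [correct[groups == g].mean() for g in range(num_groups)]

def run_demo(minority_fraction, seed=0):
    rng = np.random.default_rng(seed)
    d, K = 5, 3
    W_star = rng.standard_normal((K, d))              # hypothetical teacher weights
    means = [0.3 * np.ones(d), -0.3 * np.ones(d)]     # group means (illustrative)
    stds = [1.0, 1.0]                                 # group standard deviations (illustrative)
    fractions = [1.0 - minority_fraction, minority_fraction]
    x_tr, _ = sample_mixture(2000, fractions, means, stds, rng)
    y_tr = (forward(W_star, x_tr) > 0.5).astype(float)
    x_te, g_te = sample_mixture(2000, fractions, means, stds, rng)
    y_te = (forward(W_star, x_te) > 0.5).astype(float)
    W = train_erm(x_tr, y_tr, K, steps=300, lr=0.5, rng=rng)
    return group_accuracy(W, x_te, y_te, g_te, num_groups=2)

if __name__ == "__main__":
    for frac in (0.05, 0.2, 0.5):
        avg, per_group = run_demo(frac)
        print(f"minority fraction {frac}: average {avg:.3f}, per group {per_group}")
```

Varying `minority_fraction`, `means`, and `stds` gives a rough way to explore how the group-level means and variances interact with the minority fraction. The outcome of such a toy run should not be read as a confirmation of the paper's theorems.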

