Class imbalance poses a significant challenge to supervised classification, particularly in critical domains such as medical diagnostics and anomaly detection, where minority class instances are rare. While numerous studies have explored rebalancing techniques to address this issue, less attention has been given to how binary classifiers perform under imbalance when no such techniques are applied. The goal of this study is therefore to assess binary classifiers "as-is", without any explicit rebalancing. Specifically, we systematically evaluate the robustness of a diverse set of binary classifiers on both real-world and synthetic datasets as the minority class size is progressively reduced, using one-shot and few-shot scenarios as baselines. We also vary data complexity by generating synthetic decision boundaries that simulate real-world conditions. In addition to standard classifiers, we include experiments with undersampling and oversampling strategies and with one-class classification (OCC) methods to examine their behavior under severe imbalance. The results confirm that classification becomes more difficult as data complexity increases and the minority class size decreases. While traditional classifiers deteriorate under extreme imbalance, advanced models such as TabPFN and boosting-based ensembles retain comparatively higher performance and generalize better. Visual analyses and quantitative evaluation metrics further support these findings. Our work offers guidance on model selection for imbalanced learning and provides insight into classifier robustness without dependence on explicit rebalancing techniques.
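
The protocol of progressively reducing the minority class can be illustrated with a minimal sketch. It is not the study's implementation: the helper names `shrink_minority` and `balanced_accuracy`, the label convention (1 for the minority class), and the choice of balanced accuracy as the metric are all assumptions made here for illustration. The sketch keeps every majority instance, draws a random subset of the minority instances, and scores predictions with a metric that weights both classes equally.

```python
import random
from typing import List, Sequence, Tuple


def shrink_minority(
    X: Sequence[Sequence[float]],
    y: Sequence[int],
    n_minority: int,
    minority_label: int = 1,  # assumption: the minority class is labeled 1
    seed: int = 0,
) -> Tuple[List[Sequence[float]], List[int]]:
    """Keep all majority rows and a random subset of n_minority minority rows."""
    rng = random.Random(seed)
    minority_idx = [i for i, label in enumerate(y) if label == minority_label]
    majority_idx = [i for i, label in enumerate(y) if label != minority_label]
    if not 1 <= n_minority <= len(minority_idx):
        raise ValueError("n_minority must be between 1 and the minority class size")
    # Sort the kept indices so the original row order is preserved.
    kept = sorted(majority_idx + rng.sample(minority_idx, n_minority))
    return [X[i] for i in kept], [y[i] for i in kept]


def balanced_accuracy(
    y_true: Sequence[int],
    y_pred: Sequence[int],
    minority_label: int = 1,
) -> float:
    """Mean of minority-class recall and majority-class recall."""
    pos = sum(1 for t in y_true if t == minority_label)
    neg = len(y_true) - pos
    if pos == 0 or neg == 0:
        raise ValueError("both classes must be present in y_true")
    tp = sum(
        1 for t, p in zip(y_true, y_pred)
        if t == minority_label and p == minority_label
    )
    tn = sum(
        1 for t, p in zip(y_true, y_pred)
        if t != minority_label and p != minority_label
    )
    return 0.5 * (tp / pos + tn / neg)
```

In use, one would call `shrink_minority` on the training split with a decreasing schedule such as 8, 4, 2, 1 minority instances (the last two corresponding to few-shot and one-shot settings), fit each classifier on the reduced data, and score it on an untouched test split. A class-balanced metric is chosen in this sketch because a classifier that always predicts the majority class scores 0.5 on it regardless of the imbalance ratio, whereas its plain accuracy rises as the minority class shrinks.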