The class imbalance problem in deep learning has been explored in several studies, but there has yet to be a systematic analysis of this phenomenon in object detection. Here, we present a comprehensive analysis of, and experiments on, the foreground-background (F-B) imbalance problem in object detection, which is very common and is caused by small, infrequent objects of interest. We experimentally study the effects of different aspects of F-B imbalance (object size, number of objects, dataset size, object type) on detection performance. In addition, we compare 9 leading methods for addressing this problem, namely Faster-RCNN, SSD, OHEM, Libra-RCNN, Focal-Loss, GHM, PISA, YOLO-v3, and GFL, on a range of datasets from different imaging domains. We conclude that (1) F-B imbalance can indeed cause a significant drop in detection performance; (2) detection performance is more affected by F-B imbalance when less training data is available; (3) in most cases, decreasing object size leads to a larger performance drop than decreasing the number of objects, given the same change in the ratio of object pixels to non-object pixels; (4) among all selected methods, Libra-RCNN and PISA demonstrate the best performance in addressing F-B imbalance; (5) when the training dataset is large, the choice of method has little impact; and (6) soft-sampling methods, including Focal-Loss, GHM, and GFL, perform fairly well on average but are relatively unstable.
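The comparison between shrinking objects and removing objects is made at an equal change in the ratio of object pixels to non-object pixels. The following is a minimal sketch of how such a ratio could be computed from bounding boxes; it is an illustration, not the authors' procedure. The function name `fb_pixel_ratio`, the use of axis-aligned integer boxes with exclusive maximum coordinates, and the toy 100 x 100 image are all assumptions made here.

```python
def fb_pixel_ratio(image_size, boxes):
    """Return (object pixels) / (non-object pixels) for one image.

    Hypothetical helper, assuming axis-aligned integer boxes given as
    (x_min, y_min, x_max, y_max) with exclusive maxima. Overlapping
    boxes are counted once, and boxes are clipped to the image.
    """
    width, height = image_size
    covered = set()
    for x0, y0, x1, y1 in boxes:
        for y in range(max(0, y0), min(height, y1)):
            for x in range(max(0, x0), min(width, x1)):
                covered.add((x, y))
    foreground = len(covered)
    background = width * height - foreground
    if background == 0:
        raise ValueError("image has no background pixels")
    return foreground / background


# Toy 100 x 100 image with four 20 x 20 objects (assumed values).
image_size = (100, 100)
baseline_boxes = [(0, 0, 20, 20), (40, 0, 60, 20), (0, 40, 20, 60), (40, 40, 60, 60)]

# Two ways to halve the foreground area:
fewer_boxes = baseline_boxes[:2]                                        # half as many objects
smaller_boxes = [(x0, y0, x0 + 10, y1) for x0, y0, _, y1 in baseline_boxes]  # half-size objects

baseline = fb_pixel_ratio(image_size, baseline_boxes)
fewer = fb_pixel_ratio(image_size, fewer_boxes)
smaller = fb_pixel_ratio(image_size, smaller_boxes)
```

In this toy setup, `fewer` and `smaller` are the same ratio, reached by reducing the object count in one case and the object size in the other, which is the kind of matched pair that conclusion (3) compares.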