Instance segmentation is an advanced form of image segmentation which, beyond traditional segmentation, requires identifying the individual instances of repeating objects in a scene. Mask R-CNN is the most common architecture for instance segmentation, and proposed improvements to it include refining bounding boxes, adding semantic information, and enhancing the backbone. In all the variations proposed to date, the problem of competing kernels (each class aims to maximize its own accuracy) persists when a model tries to learn numerous classes simultaneously. In this paper, we propose mitigating this problem by replacing mask prediction with a Switch-Split block that processes refined ROIs, classifies them, and assigns them to specialized mask predictors. We name the method MaskUno and test it on various models from the literature, which are trained on multiple classes using the benchmark COCO dataset. An increase of 2.03% in the mean Average Precision (mAP) was observed for the high-performing DetectoRS when trained on 80 classes. MaskUno proved to enhance the mAP of instance segmentation models regardless of the number and type of classes.
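The routing idea behind the Switch-Split block, as described above, can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the refined ROIs have already been classified, and it replaces the class-specific mask heads (neural networks in practice) with toy functions. The names `switch_split`, `make_toy_predictor`, and the data layout are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from typing import Any, Callable, Dict, List, Sequence

# Hypothetical types for illustration: an ROI feature is a flat list of floats,
# and a "mask" is whatever the class-specific predictor returns.
RoiFeature = List[float]
MaskPredictor = Callable[[List[RoiFeature]], List[Any]]


def switch_split(
    roi_features: Sequence[RoiFeature],
    class_ids: Sequence[int],
    predictors: Dict[int, MaskPredictor],
) -> List[Any]:
    """Route each ROI to the mask predictor dedicated to its predicted class."""
    if len(roi_features) != len(class_ids):
        raise ValueError("roi_features and class_ids must have the same length")

    # Switch: group ROI indices by their predicted class.
    groups: Dict[int, List[int]] = defaultdict(list)
    for idx, class_id in enumerate(class_ids):
        groups[class_id].append(idx)

    # Split: each specialized predictor only ever sees ROIs of its own class,
    # so classes do not share (and compete for) the same mask kernels.
    masks: List[Any] = [None] * len(roi_features)
    for class_id, indices in groups.items():
        batch = [roi_features[i] for i in indices]
        outputs = predictors[class_id](batch)
        # Scatter the results back into the original ROI order.
        for i, mask in zip(indices, outputs):
            masks[i] = mask
    return masks


def make_toy_predictor(name: str) -> MaskPredictor:
    """Stand-in for a class-specific mask head (assumption: a real one is a CNN)."""
    def predict(batch: List[RoiFeature]) -> List[Any]:
        return [f"{name}-mask(sum={sum(features):.1f})" for features in batch]
    return predict


rois = [[1.0, 2.0], [3.0], [0.5, 0.5]]
class_ids = [0, 1, 0]  # assumed output of the ROI classifier
predictors = {0: make_toy_predictor("person"), 1: make_toy_predictor("car")}
result = switch_split(rois, class_ids, predictors)
print(result)
```

The design choice the sketch highlights is that a separate predictor is kept per class and selected by the classification result, instead of a single mask head that outputs one channel per class.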