Limited by expensive labeling, polyp segmentation models are plagued by data shortages. To tackle this, we propose the mixed supervised polyp segmentation paradigm (MixPolyp). Unlike traditional models that rely on a single type of annotation, MixPolyp combines diverse annotation types (mask, box, and scribble) within a single model, thereby expanding the range of usable data and reducing labeling costs. To achieve this, MixPolyp introduces three novel supervision losses to handle the different annotations: a Subspace Projection loss (L_SP), a Binary Minimum Entropy loss (L_BME), and a Linear Regularization loss (L_LR). For box annotations, L_SP eliminates shape inconsistencies between the prediction and the supervision. For scribble annotations, L_BME supervises the unlabeled pixels through a minimum-entropy constraint, thereby alleviating the sparsity of the supervision. Furthermore, L_LR provides dense supervision by enforcing consistency among predictions, thus reducing the non-uniqueness of solutions. These losses are independent of the model structure, making them generally applicable. They are used only during training and add no computational cost during inference. Extensive experiments on five datasets demonstrate MixPolyp's effectiveness.
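The abstract describes the three losses only at a high level. As a hypothetical illustration (the exact formulations in the paper may differ), the sketch below shows one plausible NumPy reading of each idea: comparing axis-wise projections of the prediction against the box mask, applying cross-entropy on scribbled pixels plus entropy minimization on unlabeled ones, and a dense consistency term between two predictions.

```python
import numpy as np

def subspace_projection_loss(pred, box_mask):
    # Project prediction and box mask onto the x- and y-axes
    # (max over rows/cols) and compare the 1-D projections.
    # This constrains the object's extent without assuming the
    # unknown shape inside the box (illustrative reading of L_SP).
    loss = 0.0
    for axis in (0, 1):
        loss += np.abs(pred.max(axis=axis) - box_mask.max(axis=axis)).mean()
    return loss / 2.0

def binary_min_entropy_loss(pred, scribble, eps=1e-6):
    # scribble: 1 = foreground, 0 = background, -1 = unlabeled.
    p = np.clip(pred, eps, 1 - eps)
    labeled = scribble >= 0
    # Cross-entropy on the sparse scribbled pixels...
    ce = -(scribble[labeled] * np.log(p[labeled])
           + (1 - scribble[labeled]) * np.log(1 - p[labeled]))
    # ...and entropy minimization on unlabeled pixels, pushing
    # their predictions toward a confident 0 or 1 (L_BME idea).
    ent = -(p[~labeled] * np.log(p[~labeled])
            + (1 - p[~labeled]) * np.log(1 - p[~labeled]))
    return ce.mean() + ent.mean()

def linear_regularization_loss(pred_a, pred_b):
    # Dense consistency between two predictions of the same image,
    # discouraging non-unique solutions (illustrative reading of L_LR).
    return np.abs(pred_a - pred_b).mean()
```

All three terms act purely on predictions, so, as the abstract notes, they are architecture-agnostic and vanish at inference time.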