Fully unsupervised segmentation pipelines naively seek the most salient object, if one is present. As a result, most methods reported in the literature deliver non-deterministic partitions that are sensitive to initialization, seed order, and threshold heuristics. We propose PANC, a weakly supervised spectral segmentation framework that uses a minimal set of annotated visual tokens to produce stable, controllable, and reproducible object masks. Building on the TokenCut approach, we augment the token-token affinity graph with a handful of priors coupled to anchor nodes. By manipulating the graph topology, we bias the spectral eigenspace toward partitions that are consistent with the annotations. Our approach preserves the global grouping enforced by dense self-supervised visual features, trading a few annotated tokens for significant gains in reproducibility, user control, and segmentation quality. Using 5 to 30 annotations per dataset, our training-free method achieves state-of-the-art performance among weakly supervised and unsupervised approaches on standard benchmarks (e.g., DUTS-TE, ECSSD, MS COCO). Moreover, it excels in domains where dense labels are costly or intra-class differences are subtle. We report strong and reliable results on homogeneous, fine-grained, and texture-limited domains, achieving 96.8% (+14.43% over SotA), 78.0% (+0.2%), and 78.8% (+0.37%) mean intersection-over-union (mIoU) on the CrackForest (CFD), CUB-200-2011, and HAM10000 datasets, respectively. On multi-object benchmarks, the framework delivers explicit, user-controllable semantic segmentation.
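The anchor-node idea sketched above can be illustrated with a minimal NumPy example. This is not the paper's implementation: the graph construction, function name, and all parameters (`tau`, `anchor_w`) are assumptions for illustration. A TokenCut-style thresholded cosine-affinity graph is extended with a virtual anchor node strongly coupled to the annotated tokens, and the Fiedler vector of the normalized Laplacian then yields a bipartition biased toward the anchor's side.

```python
import numpy as np

def anchored_spectral_cut(feats, anchor_idx, tau=0.2, anchor_w=1.0):
    """Bipartition tokens via the Fiedler vector of an affinity graph
    augmented with a virtual anchor node tied to annotated tokens.

    feats: (N, D) token features; anchor_idx: indices of annotated tokens.
    tau, anchor_w are illustrative hyperparameters, not the paper's values.
    """
    # Token-token affinities from cosine similarity, thresholded
    # (TokenCut-style); a small floor keeps the graph connected.
    f = feats / np.linalg.norm(feats, axis=1, keepdims=True)
    W = f @ f.T
    W = np.where(W > tau, W, 1e-5)

    # Append a virtual anchor node strongly coupled to annotated tokens.
    n = W.shape[0]
    a = np.full(n, 1e-5)
    a[anchor_idx] = anchor_w
    W = np.block([[W, a[:, None]],
                  [a[None, :], np.array([[1e-5]])]])
    np.fill_diagonal(W, 0.0)

    # Second-smallest eigenvector of the symmetric normalized Laplacian.
    d = W.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    L = np.eye(n + 1) - D_inv_sqrt @ W @ D_inv_sqrt
    _, vecs = np.linalg.eigh(L)          # eigenvalues in ascending order
    fiedler = vecs[:, 1][:n]             # drop the anchor's entry

    # Tokens on the anchor's side of the cut form the foreground mask.
    side = fiedler[anchor_idx].mean()
    return (fiedler * np.sign(side)) > 0
```

On well-separated token clusters, annotating one or two tokens of the target cluster is enough to pull the cut deterministically toward it, which is the reproducibility argument made above.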