In the field of Industrial Informatics, interactive segmentation has attracted significant attention for its applications in human-computer interaction and data annotation. Existing algorithms, however, struggle to balance segmentation accuracy between large and small targets, which often increases the number of user interactions required. To address this, this paper proposes a novel multi-scale token adaptation algorithm that leverages token similarity to improve segmentation across varying target sizes. The algorithm uses a differentiable top-k token selection mechanism, so that fewer tokens are needed while multi-scale token interaction remains efficient. Furthermore, a contrastive loss is introduced to better discriminate between target and background tokens, which improves the correctness and robustness of the tokens identified as similar to the target. Extensive benchmarking shows that the algorithm achieves state-of-the-art (SOTA) performance compared with current methods. An interactive demo and all code needed to reproduce the results will be released at https://github.com/hahamyt/mst.
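To make the two mechanisms named above more concrete, the following is a minimal sketch, not the authors' implementation. It assumes tokens are plain feature vectors, that token similarity is cosine similarity to a target prototype, that top-k selection is relaxed with a generic successive-softmax scheme, and that the contrastive loss has an InfoNCE-style form. The names `soft_top_k`, `contrastive_loss`, and `temperature` are hypothetical and chosen only for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / (norm + 1e-12)


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    z = sum(es)
    return [e / z for e in es]


def soft_top_k(scores, k, temperature=0.1):
    """Smooth relaxation of top-k selection (assumed scheme, not the paper's).

    Runs k successive softmaxes; after each one, mass that was already
    assigned is suppressed by adding log(1 - p) to the logits. The returned
    weights sum to k and concentrate on the k largest scores as the
    temperature decreases.
    """
    logits = [s / temperature for s in scores]
    weights = [0.0] * len(scores)
    for _ in range(k):
        probs = softmax(logits)
        weights = [w + p for w, p in zip(weights, probs)]
        logits = [l + math.log(max(1.0 - p, 1e-12)) for l, p in zip(logits, probs)]
    return weights


def contrastive_loss(query, positives, negatives, temperature=0.1):
    """InfoNCE-style loss (assumed form): pull target tokens toward the
    query prototype and push background tokens away from it."""
    neg_sum = sum(math.exp(cosine(query, n) / temperature) for n in negatives)
    total = 0.0
    for p in positives:
        pos = math.exp(cosine(query, p) / temperature)
        total += -math.log(pos / (pos + neg_sum))
    return total / len(positives)


# Toy data: a target prototype and four tokens, two of which resemble it.
query = [1.0, 0.0]
tokens = [[0.9, 0.1], [0.8, 0.3], [0.0, 1.0], [-1.0, 0.2]]

scores = [cosine(query, t) for t in tokens]
weights = soft_top_k(scores, k=2)
loss = contrastive_loss(query, positives=tokens[:2], negatives=tokens[2:])
```

In this toy setup the selection weights fall almost entirely on the first two tokens, and the loss is small because those tokens already align with the prototype. In an automatic-differentiation framework the same arithmetic would let gradients flow through the selection weights, which is the property a hard top-k lacks.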