Recent attention in instance segmentation has focused on query-based models. Although these models are end-to-end and free of non-maximum suppression (NMS), their superiority on high-accuracy real-time benchmarks has not been well demonstrated. In this paper, we show the strong potential of query-based models for designing efficient instance segmentation algorithms. We present FastInst, a simple and effective query-based framework for real-time instance segmentation. FastInst runs at real-time speed (32.5 FPS) while achieving more than 40 AP (40.5 AP) on COCO test-dev without bells and whistles. Specifically, FastInst follows the meta-architecture of the recently introduced Mask2Former. Its key designs include instance activation-guided queries, a dual-path update strategy, and ground-truth mask-guided learning, which allow us to use lighter pixel decoders and fewer Transformer decoder layers while achieving better performance. Experiments show that FastInst outperforms most state-of-the-art real-time counterparts, including strong fully convolutional baselines, in both speed and accuracy. Code can be found at https://github.com/junjiehe96/FastInst.
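To put the reported speed in perspective, the 32.5 FPS figure can be converted into an average per-image latency. The snippet below is only an illustrative sketch: the helper name `fps_to_latency_ms` and the 30 FPS real-time threshold are assumptions made here for comparison, not definitions taken from the paper, and the conversion assumes images are processed one at a time.

```python
def fps_to_latency_ms(fps: float) -> float:
    """Convert throughput (frames per second) to average latency per frame in ms.

    Assumes sequential, one-image-at-a-time processing, so latency = 1 / throughput.
    """
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


REPORTED_FPS = 32.5          # speed reported in the abstract
ASSUMED_REALTIME_FPS = 30.0  # assumption: a common real-time threshold, not from the paper

latency_ms = fps_to_latency_ms(REPORTED_FPS)
budget_ms = fps_to_latency_ms(ASSUMED_REALTIME_FPS)

print(f"{latency_ms:.1f} ms per image vs. a {budget_ms:.1f} ms budget at 30 FPS")
```

Under these assumptions, 32.5 FPS corresponds to roughly 30.8 ms per image, which fits within the approximately 33.3 ms budget implied by a 30 FPS threshold.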