Few-shot instance segmentation extends the few-shot learning paradigm to the instance segmentation task, which aims to segment object instances in a query image given only a few annotated examples of novel categories. Conventional approaches address the task via prototype learning, a form of point estimation. However, this mechanism is susceptible to noise and suffers from bias because of the severe scarcity of data. To overcome the disadvantages of point estimation, we propose a novel approach, dubbed MaskDiff, which models the underlying distribution of a binary mask conditioned on an object region and $K$-shot information. Inspired by augmentation approaches that perturb data with Gaussian noise to populate low-density regions of the data space, we model the mask distribution with a diffusion probabilistic model. In addition, we propose classifier-free guided mask sampling to integrate category information into the binary mask generation process. Without bells and whistles, our proposed method consistently outperforms state-of-the-art methods on both base and novel classes of the COCO dataset while also being more stable than existing methods.
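To make the two generic ingredients named above concrete, the following is a minimal NumPy sketch of (i) the standard forward diffusion step that perturbs a mask with Gaussian noise and (ii) the usual classifier-free guidance rule for combining conditional and unconditional noise predictions. It is an illustration under stated assumptions, not the MaskDiff implementation: the linear noise schedule, the rescaling of the binary mask to $[-1, 1]$, and the function names `linear_beta_schedule`, `q_sample`, and `guided_noise` are all hypothetical choices made for this example, and the noise predictions passed to `guided_noise` would in practice come from a trained network that is not shown here.

```python
import numpy as np


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linear variance schedule (an assumed, commonly used default)."""
    return np.linspace(beta_start, beta_end, num_steps)


def q_sample(x0, t, alpha_bar, rng):
    """Forward diffusion: x_t = sqrt(abar_t) * x0 + sqrt(1 - abar_t) * eps."""
    noise = rng.standard_normal(x0.shape)
    x_t = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * noise
    return x_t, noise


def guided_noise(eps_cond, eps_uncond, guidance_weight):
    """Classifier-free guidance: (1 + w) * eps_cond - w * eps_uncond.

    With guidance_weight = 0 this reduces to the conditional prediction.
    """
    return (1.0 + guidance_weight) * eps_cond - guidance_weight * eps_uncond


# Toy usage: an 8x8 binary mask with a 4x4 foreground square.
rng = np.random.default_rng(0)
betas = linear_beta_schedule(1000)
alpha_bar = np.cumprod(1.0 - betas)  # cumulative product of (1 - beta_t)

mask = np.zeros((8, 8))
mask[2:6, 2:6] = 1.0
x0 = 2.0 * mask - 1.0  # assumption: rescale {0, 1} to {-1, +1} before diffusing

x_t, noise = q_sample(x0, t=500, alpha_bar=alpha_bar, rng=rng)
```

At sampling time, a guided prediction of this form would replace the plain conditional prediction inside each reverse diffusion step; larger values of `guidance_weight` push samples more strongly toward the conditioning signal, here the category information.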