Zero-guidance Segmentation Using Zero Segment Labels

March 24, 2023

Pitchaporn Rewatbowornwong, Nattanat Chatthee, Ekapol Chuangsuwanich, Supasorn Suwajanakorn

CLIP has enabled new and exciting joint vision-language applications, one of which is open-vocabulary segmentation, which can locate any segment given an arbitrary text query. In our research, we ask whether it is possible to discover semantic segments without any user guidance in the form of text queries or predefined classes, and to label them automatically using natural language. We propose a novel problem, zero-guidance segmentation, and the first baseline for it, which leverages two pre-trained generalist models, DINO and CLIP, to solve this problem without any fine-tuning or segmentation dataset. The general idea is to first segment an image into small over-segments, encode them into CLIP's visual-language space, translate them into text labels, and merge semantically similar segments together. The key challenge, however, is how to encode a visual segment into a segment-specific embedding that balances global and local context information, both of which are useful for recognition. Our main contribution is a novel attention-masking technique that balances the two contexts by analyzing the attention layers inside CLIP. We also introduce several metrics for the evaluation of this new task. With CLIP's innate knowledge, our method can precisely locate the Mona Lisa painting among a museum crowd. Project page: https://zero-guide-seg.github.io/.
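The final step of the pipeline, merging semantically similar segments, can be illustrated with a minimal sketch. The abstract does not say how the merge is performed, so everything below is an assumption made for illustration: segments are grouped when the cosine similarity of their embeddings reaches a hypothetical threshold of 0.9, groups are tracked with a small union-find, and toy 2-D vectors stand in for real CLIP embeddings. The function name `merge_segments` is hypothetical and is not taken from the authors' code.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def merge_segments(embeddings, threshold=0.9):
    """Group segment indices linked by cosine similarity >= threshold.

    Assumption: single-link merging, so A and C end up together whenever
    A~B and B~C, even if A and C are not directly similar.
    """
    parent = list(range(len(embeddings)))

    def find(i):
        # Walk to the root of i's group, shortening the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(embeddings)):
        for j in range(i + 1, len(embeddings)):
            if cosine(embeddings[i], embeddings[j]) >= threshold:
                parent[find(j)] = find(i)

    groups = {}
    for i in range(len(embeddings)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Toy stand-ins for segment embeddings: two near-duplicates along each axis.
toy_embeddings = [[1.0, 0.0], [0.98, 0.05], [0.0, 1.0], [0.1, 0.99]]
print(merge_segments(toy_embeddings))  # [[0, 1], [2, 3]]
```

With real embeddings the threshold would need tuning, and a method that also accounts for spatial adjacency between segments may behave quite differently from this purely embedding-based grouping.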
