APPLeNet: Visual Attention Parameterized Prompt Learning for Few-Shot Remote Sensing Image Generalization using CLIP - 专知论文

会员服务 ·

0

视觉注意 · 提示学习 · 视觉注意力 · 泛化 · 参数化 ·

2023 年 4 月 12 日

APPLeNet: Visual Attention Parameterized Prompt Learning for Few-Shot Remote Sensing Image Generalization using CLIP

翻译：APPLeNet：面向少样本遥感图像泛化的视觉注意力参数化提示学习（基于CLIP）

Mainak Singha,Ankit Jha,Bhupendra Solanki,Shirsha Bose,Biplab Banerjee

from arxiv, 11 Pages, 6 figures, 8 tables, Accepted in Earth Vision (CVPR 2023)

In recent years, the success of large-scale vision-language models (VLMs) such as CLIP has led to their increased usage in various computer vision tasks. These models enable zero-shot inference through carefully crafted instructional text prompts without task-specific supervision. However, the potential of VLMs for generalization tasks in remote sensing (RS) has not been fully realized. To address this research gap, we propose a novel image-conditioned prompt learning strategy called the Visual Attention Parameterized Prompts Learning Network (APPLeNet). APPLeNet emphasizes the importance of multi-scale feature learning in RS scene classification and disentangles visual style and content primitives for domain generalization tasks. To achieve this, APPLeNet combines visual content features obtained from different layers of the vision encoder and style properties obtained from feature statistics of domain-specific batches. An attention-driven injection module is further introduced to generate visual tokens from this information. We also introduce an anti-correlation regularizer to ensure discrimination among the token embeddings, as this visual information is combined with the textual tokens. To validate APPLeNet, we curated four available RS benchmarks and introduced experimental protocols and datasets for three domain generalization tasks. Our results consistently outperform the relevant literature and code is available at https://github.com/mainaksingha01/APPLeNet

翻译：近年来，大规模视觉语言模型（VLM）如CLIP的成功，推动了其在各类计算机视觉任务中的广泛应用。这类模型通过精心设计的指令性文本提示，无需任务特定监督即可实现零样本推理。然而，VLM在遥感图像泛化任务中的潜力尚未被充分发掘。为填补这一研究空白，我们提出了一种新颖的图像条件化提示学习策略——视觉注意力参数化提示学习网络（APPLeNet）。APPLeNet强调多尺度特征学习在遥感场景分类中的重要性，并分离了视觉风格与内容基元以应对域泛化任务。具体而言，APPLeNet融合了从视觉编码器不同层提取的视觉内容特征，以及从域特定批次特征统计中获得的风格属性，并进一步引入注意力驱动的注入模块，由此生成视觉标记。此外，我们提出一种反相关正则化项，以确保视觉信息与文本标记结合时，标记嵌入具有判别性。为验证APPLeNet的有效性，我们整理了四个公开遥感基准数据集，并针对三个域泛化任务设计了实验协议与数据集。实验结果持续优于相关文献，代码已开源至https://github.com/mainaksingha01/APPLeNet。

0

相关内容

视觉注意

ACL2022开会了！DeepMind学者等《视觉语言预训练:当前趋势与未来》教程，阐述最新前沿技术，附Slides

ACL2022开会了！DeepMind学者等《视觉语言预训练:当前趋势与未来》教程，阐述最新前沿技术，附Slides

专知会员服务

50+阅读 · 2022年5月22日

【干货书】深度学习合成数据，354页pdf，Synthetic Data for Deep Learning

【干货书】深度学习合成数据，354页pdf，Synthetic Data for Deep Learning

专知会员服务

105+阅读 · 2022年2月10日

【AAAI2022】跨域少样本图分类

【AAAI2022】跨域少样本图分类

专知会员服务

30+阅读 · 2022年1月22日

【ICML2020-伯克利-马毅老师组】深度等距学习的视觉识别，Deep Isometric Learning for Visual Recognition

【ICML2020-伯克利-马毅老师组】深度等距学习的视觉识别，Deep Isometric Learning for Visual Recognition

专知会员服务

25+阅读 · 2020年7月1日

零样本文本分类，Zero-Shot Learning for Text Classification

零样本文本分类，Zero-Shot Learning for Text Classification

专知会员服务

97+阅读 · 2020年5月31日

深度学习图像分割综述论文最新版，Image Segmentation Using Deep Learning: A Survey

深度学习图像分割综述论文最新版，Image Segmentation Using Deep Learning: A Survey

专知会员服务

93+阅读 · 2020年4月11日

用于大型遥感影像检索的深度学习，Deep Learning for Image Search and Retrieval in Large Remote Sensing Archives

用于大型遥感影像检索的深度学习，Deep Learning for Image Search and Retrieval in Large Remote Sensing Archives

专知会员服务

39+阅读 · 2020年4月6日

【牛津大学-DeepMind 】上下文嵌入综述，A Survey on Contextual Embeddings

【牛津大学-DeepMind 】上下文嵌入综述，A Survey on Contextual Embeddings

专知会员服务

42+阅读 · 2020年3月17日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

96+阅读 · 2020年3月12日

元迁移学习的小样本学习，Meta-transfer Learning for Few-shot Learning

元迁移学习的小样本学习，Meta-transfer Learning for Few-shot Learning

专知会员服务

159+阅读 · 2020年2月29日

NAACL 2022 | 基于Prompt的文本生成迁移学习

NAACL 2022 | 基于Prompt的文本生成迁移学习

PaperWeekly

1+阅读 · 2022年8月31日

CVPR 2022 | 元学习在图像回归任务的表现

CVPR 2022 | 元学习在图像回归任务的表现

PaperWeekly

1+阅读 · 2022年6月11日

Multi-Task Learning的几篇综述文章

Multi-Task Learning的几篇综述文章

深度学习自然语言处理

15+阅读 · 2020年6月15日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

44+阅读 · 2019年1月3日

【论文推荐】最新七篇图像分割相关论文—域适应深度表示学习、循环残差卷积、二值分割、图像合成、无监督跨模态

【论文推荐】最新七篇图像分割相关论文—域适应深度表示学习、循环残差卷积、二值分割、图像合成、无监督跨模态

专知

19+阅读 · 2018年6月1日

【论文推荐】最新八篇生成对抗网络相关论文—条件翻译、RGB-D动作识别、量子生成对抗网络、语义对齐、视频摘要、视觉-文本注意力

【论文推荐】最新八篇生成对抗网络相关论文—条件翻译、RGB-D动作识别、量子生成对抗网络、语义对齐、视频摘要、视觉-文本注意力

专知

15+阅读 · 2018年5月15日

【论文推荐】最新十篇机器翻译相关论文—自然语言推理、无监督神经机器翻译、多任务学习、局部卷积、图卷积、多语种机器翻译

【论文推荐】最新十篇机器翻译相关论文—自然语言推理、无监督神经机器翻译、多任务学习、局部卷积、图卷积、多语种机器翻译

专知

15+阅读 · 2018年5月1日

【论文推荐】最新六篇图像分割相关论文—控制、全卷积网络、子空间表示、多模态图像分割

【论文推荐】最新六篇图像分割相关论文—控制、全卷积网络、子空间表示、多模态图像分割

专知

25+阅读 · 2018年4月15日

【论文推荐】最新六篇序列推荐相关论文—卷积序列嵌入学习、用户记忆网络、上下文GRU、迁移学习

【论文推荐】最新六篇序列推荐相关论文—卷积序列嵌入学习、用户记忆网络、上下文GRU、迁移学习

专知

10+阅读 · 2018年4月8日

Underlay频谱共享方式下信号参数估计和调制识别的方法研究

国家自然科学基金

0+阅读 · 2015年12月31日

基于复杂语义的个性化图像集摘要研究

国家自然科学基金

0+阅读 · 2015年12月31日

混杂固定结构控制器设计: 一种基于系统增广的框架

国家自然科学基金

0+阅读 · 2013年12月31日

Partial Spread Bent函数与Bent-Negabent函数的构造及密码学性质研究

国家自然科学基金

0+阅读 · 2013年12月31日

SAR和可见光图像的脉冲耦合神经网络分层感知融合研究

国家自然科学基金

0+阅读 · 2013年12月31日

多源遥感图像识别关键技术研究

国家自然科学基金

7+阅读 · 2012年12月31日

深度学习理论及在图像识别中的应用研究

国家自然科学基金

6+阅读 · 2012年12月31日

银屑病中皮肤DC的免疫调节机制

国家自然科学基金

0+阅读 · 2012年12月31日

基于鲁棒相似性测度的含噪图像分割的谱聚类方法

国家自然科学基金

0+阅读 · 2012年12月31日

面向英汉双向跨语言图像检索的文本分析关键技术研究

国家自然科学基金

0+阅读 · 2011年12月31日

Test-Time Adaptation with CLIP Reward for Zero-Shot Generalization in Vision-Language Models

Arxiv

0+阅读 · 2023年5月29日

GRAtt-VIS: Gated Residual Attention for Auto Rectifying Video Instance Segmentation

Arxiv

0+阅读 · 2023年5月26日

CNN Feature Map Augmentation for Single-Source Domain Generalization

Arxiv

0+阅读 · 2023年5月26日

A Survey of Visual Transformers

Arxiv

39+阅读 · 2021年11月11日

A Decade Survey of Content Based Image Retrieval using Deep Learning

Arxiv

23+阅读 · 2020年11月23日

Image Segmentation Using Deep Learning: A Survey

Image Segmentation Using Deep Learning: A Survey

Arxiv

47+阅读 · 2020年1月15日

Transfer Adaptation Learning: A Decade Survey

Transfer Adaptation Learning: A Decade Survey

Arxiv

37+阅读 · 2019年3月12日

A Survey on Deep Transfer Learning

A Survey on Deep Transfer Learning

Arxiv

11+阅读 · 2018年8月6日

Learning over Knowledge-Base Embeddings for Recommendation

Arxiv

23+阅读 · 2018年3月22日

Exploring Models and Data for Remote Sensing Image Caption Generation

Arxiv

14+阅读 · 2017年12月21日

VIP会员

文章信息

相关主题

视觉注意力

最新内容

从采集到决策：美军视角下的战术情报范式重构

从采集到决策：美军视角下的战术情报范式重构

专知会员服务

4+阅读 · 今天2:42

乌克兰“德尔塔”系统揭示无人机、数据与领导力如何重塑现代安全格局

乌克兰“德尔塔”系统揭示无人机、数据与领导力如何重塑现代安全格局

专知会员服务

1+阅读 · 今天2:37

大规模作战中的参谋流程：作为联合兵种作战组成部分的目标锁定

大规模作战中的参谋流程：作为联合兵种作战组成部分的目标锁定

专知会员服务

5+阅读 · 今天2:23

《北约概念开发与实验（CD&E）手册：概念开发者工具箱》100页手册

《北约概念开发与实验（CD&E）手册：概念开发者工具箱》100页手册

专知会员服务

6+阅读 · 今天2:21

《履带式无人地面战车技术发展现状》

《履带式无人地面战车技术发展现状》

专知会员服务

2+阅读 · 今天1:46

《美国空军B-2“幽灵”隐身轰炸机系统工程案例研究》117页

《美国空军B-2“幽灵”隐身轰炸机系统工程案例研究》117页

专知会员服务

6+阅读 · 8月1日

隐身技术前沿综述：物理机理、工程实践与战略展望

隐身技术前沿综述：物理机理、工程实践与战略展望

专知会员服务

4+阅读 · 8月1日

《多变海洋环境下无人水面艇与自主水下机器人对接的最优路径规划》

《多变海洋环境下无人水面艇与自主水下机器人对接的最优路径规划》

专知会员服务

4+阅读 · 8月1日

《以机反机：基于无人机载麦克风的空中周界入侵检测》

《以机反机：基于无人机载麦克风的空中周界入侵检测》

专知会员服务

4+阅读 · 8月1日

《无人机脆弱性利用：网络空间力量的新域》

《无人机脆弱性利用：网络空间力量的新域》

专知会员服务

2+阅读 · 8月1日

美空军如何将人工智能从战场部署至后方机关

美空军如何将人工智能从战场部署至后方机关

专知会员服务

11+阅读 · 7月31日

《美战争部指令文件：网络空间效应与使能能力测试评估》

《美战争部指令文件：网络空间效应与使能能力测试评估》

专知会员服务

8+阅读 · 7月31日

《史诗怒火行动：多域前瞻评估》49页报告

《史诗怒火行动：多域前瞻评估》49页报告

专知会员服务

8+阅读 · 7月31日

《英国防部：未来空战系统数字化战略》33页

《英国防部：未来空战系统数字化战略》33页

专知会员服务

5+阅读 · 7月31日

《面向自主飞行网络的智能体人工智能架构》

《面向自主飞行网络的智能体人工智能架构》

专知会员服务

8+阅读 · 7月31日

相关VIP内容

ACL2022开会了！DeepMind学者等《视觉语言预训练:当前趋势与未来》教程，阐述最新前沿技术，附Slides

ACL2022开会了！DeepMind学者等《视觉语言预训练:当前趋势与未来》教程，阐述最新前沿技术，附Slides

专知会员服务

50+阅读 · 2022年5月22日

【干货书】深度学习合成数据，354页pdf，Synthetic Data for Deep Learning

【干货书】深度学习合成数据，354页pdf，Synthetic Data for Deep Learning

专知会员服务

105+阅读 · 2022年2月10日

【AAAI2022】跨域少样本图分类

【AAAI2022】跨域少样本图分类

专知会员服务

30+阅读 · 2022年1月22日

【ICML2020-伯克利-马毅老师组】深度等距学习的视觉识别，Deep Isometric Learning for Visual Recognition

【ICML2020-伯克利-马毅老师组】深度等距学习的视觉识别，Deep Isometric Learning for Visual Recognition

专知会员服务

25+阅读 · 2020年7月1日

零样本文本分类，Zero-Shot Learning for Text Classification

零样本文本分类，Zero-Shot Learning for Text Classification

专知会员服务

97+阅读 · 2020年5月31日

深度学习图像分割综述论文最新版，Image Segmentation Using Deep Learning: A Survey

深度学习图像分割综述论文最新版，Image Segmentation Using Deep Learning: A Survey

专知会员服务

93+阅读 · 2020年4月11日

用于大型遥感影像检索的深度学习，Deep Learning for Image Search and Retrieval in Large Remote Sensing Archives

用于大型遥感影像检索的深度学习，Deep Learning for Image Search and Retrieval in Large Remote Sensing Archives

专知会员服务

39+阅读 · 2020年4月6日

【牛津大学-DeepMind 】上下文嵌入综述，A Survey on Contextual Embeddings

【牛津大学-DeepMind 】上下文嵌入综述，A Survey on Contextual Embeddings

专知会员服务

42+阅读 · 2020年3月17日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

96+阅读 · 2020年3月12日

元迁移学习的小样本学习，Meta-transfer Learning for Few-shot Learning

元迁移学习的小样本学习，Meta-transfer Learning for Few-shot Learning

专知会员服务

159+阅读 · 2020年2月29日

热门VIP内容

开通专知VIP会员享更多权益服务

乌克兰“德尔塔”系统揭示无人机、数据与领导力如何重塑现代安全格局

《北约概念开发与实验（CD&E）手册：概念开发者工具箱》100页手册

从采集到决策：美军视角下的战术情报范式重构

大规模作战中的参谋流程：作为联合兵种作战组成部分的目标锁定

相关资讯

NAACL 2022 | 基于Prompt的文本生成迁移学习

NAACL 2022 | 基于Prompt的文本生成迁移学习

PaperWeekly

1+阅读 · 2022年8月31日

CVPR 2022 | 元学习在图像回归任务的表现

CVPR 2022 | 元学习在图像回归任务的表现

PaperWeekly

1+阅读 · 2022年6月11日

Multi-Task Learning的几篇综述文章

Multi-Task Learning的几篇综述文章

深度学习自然语言处理

15+阅读 · 2020年6月15日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

44+阅读 · 2019年1月3日

【论文推荐】最新七篇图像分割相关论文—域适应深度表示学习、循环残差卷积、二值分割、图像合成、无监督跨模态

【论文推荐】最新七篇图像分割相关论文—域适应深度表示学习、循环残差卷积、二值分割、图像合成、无监督跨模态

专知

19+阅读 · 2018年6月1日

【论文推荐】最新八篇生成对抗网络相关论文—条件翻译、RGB-D动作识别、量子生成对抗网络、语义对齐、视频摘要、视觉-文本注意力

【论文推荐】最新八篇生成对抗网络相关论文—条件翻译、RGB-D动作识别、量子生成对抗网络、语义对齐、视频摘要、视觉-文本注意力

专知

15+阅读 · 2018年5月15日

【论文推荐】最新十篇机器翻译相关论文—自然语言推理、无监督神经机器翻译、多任务学习、局部卷积、图卷积、多语种机器翻译

【论文推荐】最新十篇机器翻译相关论文—自然语言推理、无监督神经机器翻译、多任务学习、局部卷积、图卷积、多语种机器翻译

专知

15+阅读 · 2018年5月1日

【论文推荐】最新六篇图像分割相关论文—控制、全卷积网络、子空间表示、多模态图像分割

【论文推荐】最新六篇图像分割相关论文—控制、全卷积网络、子空间表示、多模态图像分割

专知

25+阅读 · 2018年4月15日

【论文推荐】最新六篇序列推荐相关论文—卷积序列嵌入学习、用户记忆网络、上下文GRU、迁移学习

【论文推荐】最新六篇序列推荐相关论文—卷积序列嵌入学习、用户记忆网络、上下文GRU、迁移学习

专知

10+阅读 · 2018年4月8日

相关论文

Test-Time Adaptation with CLIP Reward for Zero-Shot Generalization in Vision-Language Models

Arxiv

0+阅读 · 2023年5月29日

GRAtt-VIS: Gated Residual Attention for Auto Rectifying Video Instance Segmentation

Arxiv

0+阅读 · 2023年5月26日

CNN Feature Map Augmentation for Single-Source Domain Generalization

Arxiv

0+阅读 · 2023年5月26日

A Survey of Visual Transformers

Arxiv

39+阅读 · 2021年11月11日

A Decade Survey of Content Based Image Retrieval using Deep Learning

Arxiv

23+阅读 · 2020年11月23日

Image Segmentation Using Deep Learning: A Survey

Image Segmentation Using Deep Learning: A Survey

Arxiv

47+阅读 · 2020年1月15日

Transfer Adaptation Learning: A Decade Survey

Transfer Adaptation Learning: A Decade Survey

Arxiv

37+阅读 · 2019年3月12日

A Survey on Deep Transfer Learning

A Survey on Deep Transfer Learning

Arxiv

11+阅读 · 2018年8月6日

Learning over Knowledge-Base Embeddings for Recommendation

Arxiv

23+阅读 · 2018年3月22日

Exploring Models and Data for Remote Sensing Image Caption Generation

Arxiv

14+阅读 · 2017年12月21日

相关基金

Underlay频谱共享方式下信号参数估计和调制识别的方法研究

国家自然科学基金

0+阅读 · 2015年12月31日

基于复杂语义的个性化图像集摘要研究

国家自然科学基金

0+阅读 · 2015年12月31日

混杂固定结构控制器设计: 一种基于系统增广的框架

国家自然科学基金

0+阅读 · 2013年12月31日

Partial Spread Bent函数与Bent-Negabent函数的构造及密码学性质研究

国家自然科学基金

0+阅读 · 2013年12月31日

SAR和可见光图像的脉冲耦合神经网络分层感知融合研究

国家自然科学基金

0+阅读 · 2013年12月31日

多源遥感图像识别关键技术研究

国家自然科学基金

7+阅读 · 2012年12月31日

深度学习理论及在图像识别中的应用研究

国家自然科学基金

6+阅读 · 2012年12月31日

银屑病中皮肤DC的免疫调节机制

国家自然科学基金

0+阅读 · 2012年12月31日

基于鲁棒相似性测度的含噪图像分割的谱聚类方法

国家自然科学基金

0+阅读 · 2012年12月31日

面向英汉双向跨语言图像检索的文本分析关键技术研究

国家自然科学基金

0+阅读 · 2011年12月31日

微信扫码咨询专知VIP会员