Continual Vision-Language Representation Learning with Off-Diagonal Information - 专知论文

会员服务 ·

0

Continuity · INFORMS · Learning · 表示学习 · 表示 ·

2023 年 5 月 16 日

Continual Vision-Language Representation Learning with Off-Diagonal Information

翻译：持续视觉-语言表示学习中的非对角线信息

Zixuan Ni,Longhui Wei,Siliang Tang,Yueting Zhuang,Qi Tian

This paper discusses the feasibility of continuously training the CLIP model through streaming data. Then, by tracking the directional changes of the representation vectors in the continuously updated CLIP model, we explore and summarize these spatial variations as Spatial Disorder (SD), which can be divided into Intra-modal Rotation and Inter-modal Deviation. Moreover, we demonstrate how intra-modal rotation and inter-modal deviation lead to a performance decline for CLIP on cross-modal retrieval tasks in both empirically and theoretically. To alleviate the spatial disorder, we propose a simple yet effective continual learning framework Mod-X: \textbf{M}aintain \textbf{o}ff-\textbf{d}iagonal information-matri\textbf{X}. The experiments (in Section \ref{method}, \ref{experiments} and Appendix \ref{Appendix_to_experiments}) on commonly used datasets with different scales and scopes have illustrated the effectiveness of our method.

翻译：本文探讨了通过流数据持续训练CLIP模型的可行性。接着，通过追踪持续更新的CLIP模型中表示向量的方向变化，我们将这些空间变化探索并总结为空间无序（Spatial Disorder, SD），可进一步分为模态内旋转（Intra-modal Rotation）与模态间偏移（Inter-modal Deviation）。此外，我们从经验与理论两方面论证了模态内旋转与模态间偏移如何导致CLIP模型在跨模态检索任务中的性能下降。为缓解空间无序问题，我们提出了一种简单而有效的持续学习框架Mod-X：\textbf{维}护\textbf{非}对角线信息矩\textbf{阵}。在方法（第\ref{method}节）、实验（第\ref{experiments}节）及附录（附录\ref{Appendix_to_experiments}）中，针对不同规模与范围的常用数据集进行的实验验证了我们方法的有效性。

0

相关内容

Continuity

让 iOS 8 和 OS X Yosemite 无缝切换的一个新特性。 > Apple products have always been designed to work together beautifully. But now they may really surprise you. With iOS 8 and OS X Yosemite, you’ll be able to do more wonderful things than ever before.

Source: Apple - iOS 8

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

专知会员服务

78+阅读 · 2022年3月15日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

167+阅读 · 2020年3月18日

【新书】数字图像(影像)处理手第二版，2176pdf，Mathematical Methods in Imaging

【新书】数字图像(影像)处理手第二版，2176pdf，Mathematical Methods in Imaging

专知会员服务

94+阅读 · 2020年2月12日

【跨语言BERT模型大集合】Transfer learning is increasingly going multilingual with language-specific BERT models

专知会员服务

54+阅读 · 2020年1月30日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

2019年机器学习框架回顾

2019年机器学习框架回顾

专知会员服务

36+阅读 · 2019年10月11日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

84+阅读 · 2019年10月9日

Multi-Task Learning的几篇综述文章

Multi-Task Learning的几篇综述文章

深度学习自然语言处理

15+阅读 · 2020年6月15日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

16+阅读 · 2019年1月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

无监督元学习表示学习

无监督元学习表示学习

CreateAMind

27+阅读 · 2019年1月4日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

44+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

【推荐】GAN架构入门综述(资源汇总)

【推荐】GAN架构入门综述(资源汇总)

机器学习研究会

10+阅读 · 2017年9月3日

带有通信量化和延时的多智能体系统一致性研究

国家自然科学基金

0+阅读 · 2014年12月31日

复合材料结构分析中的辛有限元分形研究

国家自然科学基金

0+阅读 · 2014年12月31日

糖基化终末产物在糖尿病性角膜上皮病变中的作用及其机制研究

国家自然科学基金

0+阅读 · 2013年12月31日

AM真菌对锑在土壤-植物系统中迁移转化的作用机理

国家自然科学基金

0+阅读 · 2013年12月31日

移动互联网中基于历史行为的用户偏好在线学习机制

国家自然科学基金

1+阅读 · 2013年12月31日

碳纤维增强复合材料索锚固体系的耐久性及优化设计研究

国家自然科学基金

0+阅读 · 2012年12月31日

轻金属硼基氢化物复合材料的制备及储氢性能研究

国家自然科学基金

0+阅读 · 2012年12月31日

多壁碳纳米管/硅橡胶复合材料的结构与动态力学性能研究

国家自然科学基金

0+阅读 · 2012年12月31日

强震下高层建筑结构连续倒塌破坏机理与评价方法研究

国家自然科学基金

0+阅读 · 2011年12月31日

TR3相互作用新蛋白机理研究

国家自然科学基金

1+阅读 · 2008年12月31日

VSA: Learning Varied-Size Window Attention in Vision Transformers

Arxiv

0+阅读 · 2023年7月3日

Unified Language Representation for Question Answering over Text, Tables, and Images

Arxiv

0+阅读 · 2023年6月29日

Understanding and Constructing Latent Modality Structures in Multi-modal Representation Learning

Arxiv

11+阅读 · 2023年3月10日

Learning Imbalanced Data with Vision Transformers

Arxiv

11+阅读 · 2023年3月8日

Lifelong Embedding Learning and Transfer for Growing Knowledge Graphs

Arxiv

15+阅读 · 2022年11月29日

Meta Learning for Natural Language Processing: A Survey

Meta Learning for Natural Language Processing: A Survey

Arxiv

15+阅读 · 2022年5月3日

VLP: A Survey on Vision-Language Pre-training

VLP: A Survey on Vision-Language Pre-training

Arxiv

11+阅读 · 2022年2月21日

QA-GNN: Reasoning with Language Models and Knowledge Graphs for Question Answering

Arxiv

20+阅读 · 2021年5月27日

A continual learning survey: Defying forgetting in classification tasks

Arxiv

32+阅读 · 2021年4月16日

Pre-training Text Representations as Meta Learning

Arxiv

13+阅读 · 2020年4月12日

VIP会员

文章信息

相关主题

最新内容

ICML 2026 | 自回归Boltzmann生成器重塑分子采样

ICML 2026 | 自回归Boltzmann生成器重塑分子采样

专知会员服务

0+阅读 · 53分钟前

GNN跨域综述：从消息传递到图基础模型

GNN跨域综述：从消息传递到图基础模型

专知会员服务

0+阅读 · 55分钟前

无人机自主控制与人工智能：系统性综述

无人机自主控制与人工智能：系统性综述

专知会员服务

11+阅读 · 今天7:25

巡飞弹与反无人机系统——现代战场的两大支柱

巡飞弹与反无人机系统——现代战场的两大支柱

专知会员服务

3+阅读 · 今天6:54

《打造“黄金舰队”》57页报告

《打造“黄金舰队”》57页报告

专知会员服务

3+阅读 · 今天6:52

《北约数字教官网络发展路径》128页报告

《北约数字教官网络发展路径》128页报告

专知会员服务

2+阅读 · 今天6:33

ECCV 2026 | MIMFlow：MIM与归一化流统一图像生成

ECCV 2026 | MIMFlow：MIM与归一化流统一图像生成

专知会员服务

7+阅读 · 6月25日

超越自回归边界：扩散模型、世界模型与SSM如何重塑代码智能

超越自回归边界：扩散模型、世界模型与SSM如何重塑代码智能

专知会员服务

6+阅读 · 6月25日

重塑决策优势：美军作战艺术与多域作战中联盟联合全域指挥控制（CJADC2）体系的融合

重塑决策优势：美军作战艺术与多域作战中联盟联合全域指挥控制（CJADC2）体系的融合

专知会员服务

10+阅读 · 6月25日

网状网络及其在军事领域的运用

网状网络及其在军事领域的运用

专知会员服务

8+阅读 · 6月25日

《意识即战场——全球安全体系中认知战的演进：乌克兰构建认知作战体系的展望》

《意识即战场——全球安全体系中认知战的演进：乌克兰构建认知作战体系的展望》

专知会员服务

8+阅读 · 6月25日

无美国参与的欧洲战争方式（万字长文）

无美国参与的欧洲战争方式（万字长文）

专知会员服务

8+阅读 · 6月25日

重构“下一场战争”的制胜理论：超越兰彻斯特方程与现代系统

重构“下一场战争”的制胜理论：超越兰彻斯特方程与现代系统

专知会员服务

10+阅读 · 6月25日

《国防工业中基于模型定义的实施：产品定义数字化转型的战略路径》90页

《国防工业中基于模型定义的实施：产品定义数字化转型的战略路径》90页

专知会员服务

9+阅读 · 6月25日

《国防领域敏感性分析白皮书》

《国防领域敏感性分析白皮书》

专知会员服务

9+阅读 · 6月25日

相关VIP内容

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

专知会员服务

78+阅读 · 2022年3月15日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

167+阅读 · 2020年3月18日

【新书】数字图像(影像)处理手第二版，2176pdf，Mathematical Methods in Imaging

【新书】数字图像(影像)处理手第二版，2176pdf，Mathematical Methods in Imaging

专知会员服务

94+阅读 · 2020年2月12日

【跨语言BERT模型大集合】Transfer learning is increasingly going multilingual with language-specific BERT models

专知会员服务

54+阅读 · 2020年1月30日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

2019年机器学习框架回顾

2019年机器学习框架回顾

专知会员服务

36+阅读 · 2019年10月11日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

84+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

GNN跨域综述：从消息传递到图基础模型

巡飞弹与反无人机系统——现代战场的两大支柱

ICML 2026 | 自回归Boltzmann生成器重塑分子采样

无人机自主控制与人工智能：系统性综述

相关资讯

Multi-Task Learning的几篇综述文章

Multi-Task Learning的几篇综述文章

深度学习自然语言处理

15+阅读 · 2020年6月15日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

16+阅读 · 2019年1月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

无监督元学习表示学习

无监督元学习表示学习

CreateAMind

27+阅读 · 2019年1月4日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

44+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

【推荐】GAN架构入门综述(资源汇总)

【推荐】GAN架构入门综述(资源汇总)

机器学习研究会

10+阅读 · 2017年9月3日

相关论文

VSA: Learning Varied-Size Window Attention in Vision Transformers

Arxiv

0+阅读 · 2023年7月3日

Unified Language Representation for Question Answering over Text, Tables, and Images

Arxiv

0+阅读 · 2023年6月29日

Understanding and Constructing Latent Modality Structures in Multi-modal Representation Learning

Arxiv

11+阅读 · 2023年3月10日

Learning Imbalanced Data with Vision Transformers

Arxiv

11+阅读 · 2023年3月8日

Lifelong Embedding Learning and Transfer for Growing Knowledge Graphs

Arxiv

15+阅读 · 2022年11月29日

Meta Learning for Natural Language Processing: A Survey

Meta Learning for Natural Language Processing: A Survey

Arxiv

15+阅读 · 2022年5月3日

VLP: A Survey on Vision-Language Pre-training

VLP: A Survey on Vision-Language Pre-training

Arxiv

11+阅读 · 2022年2月21日

QA-GNN: Reasoning with Language Models and Knowledge Graphs for Question Answering

Arxiv

20+阅读 · 2021年5月27日

A continual learning survey: Defying forgetting in classification tasks

Arxiv

32+阅读 · 2021年4月16日

Pre-training Text Representations as Meta Learning

Arxiv

13+阅读 · 2020年4月12日

相关基金

带有通信量化和延时的多智能体系统一致性研究

国家自然科学基金

0+阅读 · 2014年12月31日

复合材料结构分析中的辛有限元分形研究

国家自然科学基金

0+阅读 · 2014年12月31日

糖基化终末产物在糖尿病性角膜上皮病变中的作用及其机制研究

国家自然科学基金

0+阅读 · 2013年12月31日

AM真菌对锑在土壤-植物系统中迁移转化的作用机理

国家自然科学基金

0+阅读 · 2013年12月31日

移动互联网中基于历史行为的用户偏好在线学习机制

国家自然科学基金

1+阅读 · 2013年12月31日

碳纤维增强复合材料索锚固体系的耐久性及优化设计研究

国家自然科学基金

0+阅读 · 2012年12月31日

轻金属硼基氢化物复合材料的制备及储氢性能研究

国家自然科学基金

0+阅读 · 2012年12月31日

多壁碳纳米管/硅橡胶复合材料的结构与动态力学性能研究

国家自然科学基金

0+阅读 · 2012年12月31日

强震下高层建筑结构连续倒塌破坏机理与评价方法研究

国家自然科学基金

0+阅读 · 2011年12月31日

TR3相互作用新蛋白机理研究

国家自然科学基金

1+阅读 · 2008年12月31日

微信扫码咨询专知VIP会员