The growing parameter scale of multimodal large language models (MLLMs) brings significant capabilities, notably in-context learning, in which MLLMs improve task performance without updating pre-trained parameters. This effectiveness, however, hinges on the appropriate selection of in-context examples, a process currently biased toward visual data and overlooking textual information. Moreover, supervised retrievers for MLLMs, crucial for optimal in-context example selection, remain largely unexplored. Our study offers an in-depth evaluation of how textual information affects the unsupervised selection of in-context examples in multimodal settings, uncovering a notable sensitivity of retriever performance to the modalities employed. In response, we introduce MSIER, a novel supervised MLLM retriever that uses a neural network to select examples that enhance the efficiency of multimodal in-context learning. We validate this approach through extensive testing on three distinct tasks, demonstrating its effectiveness. We further investigate how modalities influence the training of our supervised retrieval method and pinpoint the factors behind the model's success. This exploration paves the way for future advances, highlighting the potential for refined in-context learning in MLLMs through the strategic use of multimodal data.