Where Should Knowledge Enter? A Layered Framework for Knowledge Infusion in Multimodal Iterative Generative Model - 专知论文

会员服务 ·

0

知识 (knowledge) · MoDELS · 层 · 多峰值 · Processing（编程语言） ·

Where Should Knowledge Enter? A Layered Framework for Knowledge Infusion in Multimodal Iterative Generative Model

翻译：暂无翻译

Renjith Prasad,Chathurangi Shyalika,Anushka Pawar,Aahan Rathod,Amit Sheth

Multimodal generative models produce fluent outputs but remain unreliable when generation must respect structured, domain-specific, or safety-critical knowledge. Existing methods incorporate knowledge through mechanisms such as prompt augmentation, guidance, latent editing, or fine-tuning, yet they are typically categorized by technique rather than by the component of the generative process they modify. We argue that knowledge infusion in iterative generative models is fundamentally anintervention-layer problem. Since thegenerative process unfolds as a trajectory of internal states, knowledge can act on four structurally distinct components of this process: the input/output boundary, the transition function, the intermediate state, and the model parameters. This maps to four intervention layers: surface, trajectory, latent, and parametric infusion. We instantiate the framework in diffusion models, map representative methods to all four layers, and derive design principles for multi-layer composition. In a controlled safety-alignment experiment using a multimodal knowledge graph with two diffusion backbones, we implement three of the four layers cumulatively, surface (input-side and output-side) and trajectory--latent (mid-generation). We show empirically that each additional layer addresses failure classes that prior layers cannot reach, reducing knowledge-violating outputs by 70.97% compared to vanilla generation and empirically confirming the framework's complementarity prediction.

翻译：暂无翻译

0

相关内容

知识 (knowledge)

知识 (knowledge)

通过学习、实践或探索所获得的认识、判断或技能。

《生成式模型: 变分自编码器与扩散模型》，75页ppt，Google DeepMind科学家Ruiqi Gao

《生成式模型: 变分自编码器与扩散模型》，75页ppt，Google DeepMind科学家Ruiqi Gao

专知会员服务

66+阅读 · 2023年6月10日

多模态认知计算

多模态认知计算

专知会员服务

182+阅读 · 2022年9月16日

【CVPR 2022】基于元内存传输的跨域少镜头语义分割，Remember the Difference: Cross-Domain Few-Shot Semantic Segmentation via Meta-Memory Transfer

【CVPR 2022】基于元内存传输的跨域少镜头语义分割，Remember the Difference: Cross-Domain Few-Shot Semantic Segmentation via Meta-Memory Transfer

专知会员服务

13+阅读 · 2022年3月12日

【翻译-ACL2020】使用知识库嵌入改进知识图上的多跳问答

【翻译-ACL2020】使用知识库嵌入改进知识图上的多跳问答

专知会员服务

70+阅读 · 2020年7月3日

【论文推荐】多模态知识图谱上的端到端实体分类，End-to-End Entity Classification on Multimodal Knowledge Graphs

【论文推荐】多模态知识图谱上的端到端实体分类，End-to-End Entity Classification on Multimodal Knowledge Graphs

专知会员服务

50+阅读 · 2020年3月30日

【论文】知识图嵌入的群表示理论（Group Representation Theory for Knowledge Graph Embedding），俄亥俄州立大学| Chen Cai

【论文】知识图嵌入的群表示理论（Group Representation Theory for Knowledge Graph Embedding），俄亥俄州立大学| Chen Cai

专知会员服务

31+阅读 · 2019年12月30日

【NeurIPS 2019论文PPT】通过任务感知调制的多模态模型不可知论元学习（Multimodal Model Agnostic Meta-Learning via Task-Aware Modulation）

【NeurIPS 2019论文PPT】通过任务感知调制的多模态模型不可知论元学习（Multimodal Model Agnostic Meta-Learning via Task-Aware Modulation）

专知会员服务

24+阅读 · 2019年12月30日

【AAAI2020】知识图谱对齐网络（Knowledge Graph Alignment Network with Gated Multi-hop Neighborhood Aggregation），孙泽群，胡伟

【AAAI2020】知识图谱对齐网络（Knowledge Graph Alignment Network with Gated Multi-hop Neighborhood Aggregation），孙泽群，胡伟

专知会员服务

60+阅读 · 2019年11月25日

【CoRL2019最佳论文】模仿学习，A Divergence Minimization Perspective on Imitation Learning Methods

【CoRL2019最佳论文】模仿学习，A Divergence Minimization Perspective on Imitation Learning Methods

专知会员服务

24+阅读 · 2019年11月11日

From Data to Model Programming: Injecting Structured Priors for Knowledge Extraction，南加州大学计算机科学系任翔助理教授，CIPS ATT 16（2019）

From Data to Model Programming: Injecting Structured Priors for Knowledge Extraction，南加州大学计算机科学系任翔助理教授，CIPS ATT 16（2019）

专知会员服务

14+阅读 · 2019年10月25日

300+篇文献！一文详解基于Transformer的多模态学习最新进展

300+篇文献！一文详解基于Transformer的多模态学习最新进展

PaperWeekly

13+阅读 · 2022年7月1日

GAN新书《生成式深度学习》Generative Deep Learning，附379页全文PDF

GAN新书《生成式深度学习》Generative Deep Learning，附379页全文PDF

专知

96+阅读 · 2019年9月30日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

知识图谱中的深度学习技术应用概述

知识图谱中的深度学习技术应用概述

深度学习与NLP

11+阅读 · 2018年9月13日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

半监督多任务学习：Semisupervised Multitask Learning

半监督多任务学习：Semisupervised Multitask Learning

我爱读PAMI

18+阅读 · 2018年4月29日

【干货】谷歌一个模型解决所有问题《One Model to Learn Them All》论文深度解读

【干货】谷歌一个模型解决所有问题《One Model to Learn Them All》论文深度解读

专知

10+阅读 · 2018年1月14日

读书报告 | Deep Learning for Extreme Multi-label Text Classification

读书报告 | Deep Learning for Extreme Multi-label Text Classification

科技创新与创业

48+阅读 · 2018年1月10日

图文混合跨媒体知识单元的模糊分类方法研究

国家自然科学基金

1+阅读 · 2015年12月31日

面向数万处理器的有限元线性方程组与模态多级算法研究

国家自然科学基金

0+阅读 · 2015年12月31日

嵌入式异构多核系统应用程序自动并行化过程关键技术研究

国家自然科学基金

1+阅读 · 2015年12月31日

基于极限学习单元的多生物特征图像深度学习建模与识别研究

国家自然科学基金

2+阅读 · 2015年12月31日

基于Phase-type分布的多状态系统可靠性模型研究

国家自然科学基金

0+阅读 · 2015年12月31日

基于多域认知的空天信息网络智能拓扑构建机制基础研究

国家自然科学基金

0+阅读 · 2015年12月31日

基于框架提升变换的多源图像融合研究

国家自然科学基金

2+阅读 · 2015年12月31日

城市知识流的表征及其结构演化的复杂性研究

国家自然科学基金

0+阅读 · 2014年12月31日

种群遗传学的多人交互式学习研究

国家自然科学基金

0+阅读 · 2014年12月31日

基于结构化方法的复杂研发项目多领域集成分析与优化研究

国家自然科学基金

2+阅读 · 2014年12月31日

Public Diffusion Models, Private Images: Key-Controlled Inversion for Conditional Reconstruction

Arxiv

0+阅读 · 6月22日

Disentangling Intrinsic Importance from Emergent Structure in Multi-Expert Orchestration

Arxiv

0+阅读 · 6月21日

Learning Geometric Representations from Videos for Spatial Intelligent Multimodal Large Language Models

Arxiv

0+阅读 · 6月18日

Unified Multimodal Autoregressive Modeling with Shared Context-Visual Tokenizer is Key to Unification

Arxiv

0+阅读 · 6月17日

From Paper to Program: Externalizing and Diagnosing Knowledge Bottlenecks in AI-Assisted Quantum Many-Body Code Generation

Arxiv

0+阅读 · 6月17日

Deep Generative Models on 3D Representations: A Survey

Arxiv

15+阅读 · 2022年10月27日

Foundations and Recent Trends in Multimodal Machine Learning: Principles, Challenges, and Open Questions

Arxiv

48+阅读 · 2022年9月7日

Learning to Learn and Predict: A Meta-Learning Approach for Multi-Label Classification

Learning to Learn and Predict: A Meta-Learning Approach for Multi-Label Classification

Arxiv

17+阅读 · 2019年9月9日

Text Generation from Knowledge Graphs with Graph Transformers

Arxiv

35+阅读 · 2019年4月4日

Learning Heterogeneous Knowledge Base Embeddings for Explainable Recommendation

Arxiv

11+阅读 · 2018年5月9日

VIP会员

文章信息

相关主题

知识 (knowledge)

Processing（编程语言）

最新内容

ICML 2026 | CFPO：用反事实策略优化提升多模态推理

ICML 2026 | CFPO：用反事实策略优化提升多模态推理

专知会员服务

1+阅读 · 今天14:45

综述 | 世界动作模型：少做梦，多行动

综述 | 世界动作模型：少做梦，多行动

专知会员服务

1+阅读 · 今天14:43

美以伊冲突：无人机与人工智能的运用

美以伊冲突：无人机与人工智能的运用

专知会员服务

3+阅读 · 今天14:31

《战时图神经网络：整合以色列-伊朗冲突中的网络安全与无人机智能》最新50页文献

《战时图神经网络：整合以色列-伊朗冲突中的网络安全与无人机智能》最新50页文献

专知会员服务

3+阅读 · 今天14:20

《特种部队在透明战场中的生存力》最新报告

《特种部队在透明战场中的生存力》最新报告

专知会员服务

2+阅读 · 今天14:11

《自主无人机蜂群协同与控制系统：人工智能赋能的战场协同与自主任务编排平台》

《自主无人机蜂群协同与控制系统：人工智能赋能的战场协同与自主任务编排平台》

专知会员服务

3+阅读 · 今天14:07

《人工智能生成的零日漏洞：对未来作战的影响》

《人工智能生成的零日漏洞：对未来作战的影响》

专知会员服务

3+阅读 · 今天14:03

《理解伙伴国在防务能力选择中的偏好：探索美国解决方案的替代选择》美智库200页报告

《理解伙伴国在防务能力选择中的偏好：探索美国解决方案的替代选择》美智库200页报告

专知会员服务

2+阅读 · 今天13:59

ICML 2026 | 边界嵌入塑形：用自适应对比学习破解图结构纠缠

ICML 2026 | 边界嵌入塑形：用自适应对比学习破解图结构纠缠

专知会员服务

5+阅读 · 6月22日

综述 | 3D场景图：开放挑战与未来方向

综述 | 3D场景图：开放挑战与未来方向

专知会员服务

8+阅读 · 6月22日

《国防工业6.0：全自主作战系统、量子-人工智能融合与新一代战略威慑》

《国防工业6.0：全自主作战系统、量子-人工智能融合与新一代战略威慑》

专知会员服务

7+阅读 · 6月22日

21世纪的无人机战争

21世纪的无人机战争

专知会员服务

4+阅读 · 6月22日

《伊朗与以色列-美国热战及其对数字技术的影响》

《伊朗与以色列-美国热战及其对数字技术的影响》

专知会员服务

5+阅读 · 6月22日

《量子技术的军事任务技术适配与利用》

《量子技术的军事任务技术适配与利用》

专知会员服务

5+阅读 · 6月22日

《美国陆军军官学校（西点军校）本科生科研中生成式人工智能的使用》

《美国陆军军官学校（西点军校）本科生科研中生成式人工智能的使用》

专知会员服务

8+阅读 · 6月22日

相关VIP内容

《生成式模型: 变分自编码器与扩散模型》，75页ppt，Google DeepMind科学家Ruiqi Gao

《生成式模型: 变分自编码器与扩散模型》，75页ppt，Google DeepMind科学家Ruiqi Gao

专知会员服务

66+阅读 · 2023年6月10日

多模态认知计算

多模态认知计算

专知会员服务

182+阅读 · 2022年9月16日

【CVPR 2022】基于元内存传输的跨域少镜头语义分割，Remember the Difference: Cross-Domain Few-Shot Semantic Segmentation via Meta-Memory Transfer

【CVPR 2022】基于元内存传输的跨域少镜头语义分割，Remember the Difference: Cross-Domain Few-Shot Semantic Segmentation via Meta-Memory Transfer

专知会员服务

13+阅读 · 2022年3月12日

【翻译-ACL2020】使用知识库嵌入改进知识图上的多跳问答

【翻译-ACL2020】使用知识库嵌入改进知识图上的多跳问答

专知会员服务

70+阅读 · 2020年7月3日

【论文推荐】多模态知识图谱上的端到端实体分类，End-to-End Entity Classification on Multimodal Knowledge Graphs

【论文推荐】多模态知识图谱上的端到端实体分类，End-to-End Entity Classification on Multimodal Knowledge Graphs

专知会员服务

50+阅读 · 2020年3月30日

【论文】知识图嵌入的群表示理论（Group Representation Theory for Knowledge Graph Embedding），俄亥俄州立大学| Chen Cai

【论文】知识图嵌入的群表示理论（Group Representation Theory for Knowledge Graph Embedding），俄亥俄州立大学| Chen Cai

专知会员服务

31+阅读 · 2019年12月30日

【NeurIPS 2019论文PPT】通过任务感知调制的多模态模型不可知论元学习（Multimodal Model Agnostic Meta-Learning via Task-Aware Modulation）

【NeurIPS 2019论文PPT】通过任务感知调制的多模态模型不可知论元学习（Multimodal Model Agnostic Meta-Learning via Task-Aware Modulation）

专知会员服务

24+阅读 · 2019年12月30日

【AAAI2020】知识图谱对齐网络（Knowledge Graph Alignment Network with Gated Multi-hop Neighborhood Aggregation），孙泽群，胡伟

【AAAI2020】知识图谱对齐网络（Knowledge Graph Alignment Network with Gated Multi-hop Neighborhood Aggregation），孙泽群，胡伟

专知会员服务

60+阅读 · 2019年11月25日

【CoRL2019最佳论文】模仿学习，A Divergence Minimization Perspective on Imitation Learning Methods

【CoRL2019最佳论文】模仿学习，A Divergence Minimization Perspective on Imitation Learning Methods

专知会员服务

24+阅读 · 2019年11月11日

From Data to Model Programming: Injecting Structured Priors for Knowledge Extraction，南加州大学计算机科学系任翔助理教授，CIPS ATT 16（2019）

From Data to Model Programming: Injecting Structured Priors for Knowledge Extraction，南加州大学计算机科学系任翔助理教授，CIPS ATT 16（2019）

专知会员服务

14+阅读 · 2019年10月25日

热门VIP内容

开通专知VIP会员享更多权益服务

综述 | 世界动作模型：少做梦，多行动

《战时图神经网络：整合以色列-伊朗冲突中的网络安全与无人机智能》最新50页文献

ICML 2026 | CFPO：用反事实策略优化提升多模态推理

美以伊冲突：无人机与人工智能的运用

相关资讯

300+篇文献！一文详解基于Transformer的多模态学习最新进展

300+篇文献！一文详解基于Transformer的多模态学习最新进展

PaperWeekly

13+阅读 · 2022年7月1日

GAN新书《生成式深度学习》Generative Deep Learning，附379页全文PDF

GAN新书《生成式深度学习》Generative Deep Learning，附379页全文PDF

专知

96+阅读 · 2019年9月30日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

知识图谱中的深度学习技术应用概述

知识图谱中的深度学习技术应用概述

深度学习与NLP

11+阅读 · 2018年9月13日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

半监督多任务学习：Semisupervised Multitask Learning

半监督多任务学习：Semisupervised Multitask Learning

我爱读PAMI

18+阅读 · 2018年4月29日

【干货】谷歌一个模型解决所有问题《One Model to Learn Them All》论文深度解读

【干货】谷歌一个模型解决所有问题《One Model to Learn Them All》论文深度解读

专知

10+阅读 · 2018年1月14日

读书报告 | Deep Learning for Extreme Multi-label Text Classification

读书报告 | Deep Learning for Extreme Multi-label Text Classification

科技创新与创业

48+阅读 · 2018年1月10日

相关论文

Public Diffusion Models, Private Images: Key-Controlled Inversion for Conditional Reconstruction

Arxiv

0+阅读 · 6月22日

Disentangling Intrinsic Importance from Emergent Structure in Multi-Expert Orchestration

Arxiv

0+阅读 · 6月21日

Learning Geometric Representations from Videos for Spatial Intelligent Multimodal Large Language Models

Arxiv

0+阅读 · 6月18日

Unified Multimodal Autoregressive Modeling with Shared Context-Visual Tokenizer is Key to Unification

Arxiv

0+阅读 · 6月17日

From Paper to Program: Externalizing and Diagnosing Knowledge Bottlenecks in AI-Assisted Quantum Many-Body Code Generation

Arxiv

0+阅读 · 6月17日

Deep Generative Models on 3D Representations: A Survey

Arxiv

15+阅读 · 2022年10月27日

Foundations and Recent Trends in Multimodal Machine Learning: Principles, Challenges, and Open Questions

Arxiv

48+阅读 · 2022年9月7日

Learning to Learn and Predict: A Meta-Learning Approach for Multi-Label Classification

Learning to Learn and Predict: A Meta-Learning Approach for Multi-Label Classification

Arxiv

17+阅读 · 2019年9月9日

Text Generation from Knowledge Graphs with Graph Transformers

Arxiv

35+阅读 · 2019年4月4日

Learning Heterogeneous Knowledge Base Embeddings for Explainable Recommendation

Arxiv

11+阅读 · 2018年5月9日

相关基金

图文混合跨媒体知识单元的模糊分类方法研究

国家自然科学基金

1+阅读 · 2015年12月31日

面向数万处理器的有限元线性方程组与模态多级算法研究

国家自然科学基金

0+阅读 · 2015年12月31日

嵌入式异构多核系统应用程序自动并行化过程关键技术研究

国家自然科学基金

1+阅读 · 2015年12月31日

基于极限学习单元的多生物特征图像深度学习建模与识别研究

国家自然科学基金

2+阅读 · 2015年12月31日

基于Phase-type分布的多状态系统可靠性模型研究

国家自然科学基金

0+阅读 · 2015年12月31日

基于多域认知的空天信息网络智能拓扑构建机制基础研究

国家自然科学基金

0+阅读 · 2015年12月31日

基于框架提升变换的多源图像融合研究

国家自然科学基金

2+阅读 · 2015年12月31日

城市知识流的表征及其结构演化的复杂性研究

国家自然科学基金

0+阅读 · 2014年12月31日

种群遗传学的多人交互式学习研究

国家自然科学基金

0+阅读 · 2014年12月31日

基于结构化方法的复杂研发项目多领域集成分析与优化研究

国家自然科学基金

2+阅读 · 2014年12月31日

微信扫码咨询专知VIP会员