Deep Video Inpainting Guided by Audio-Visual Self-Supervision - 专知论文

会员服务 ·

0

图像修复 · 知识 (knowledge) · Networking · 损失 · Learning ·

2023 年 10 月 11 日

Deep Video Inpainting Guided by Audio-Visual Self-Supervision

翻译：基于音视频自监督引导的深度视频修复

Kyuyeon Kim,Junsik Jung,Woo Jae Kim,Sung-Eui Yoon

from arxiv, Accepted at ICASSP 2022

Humans can easily imagine a scene from auditory information based on their prior knowledge of audio-visual events. In this paper, we mimic this innate human ability in deep learning models to improve the quality of video inpainting. To implement the prior knowledge, we first train the audio-visual network, which learns the correspondence between auditory and visual information. Then, the audio-visual network is employed as a guider that conveys the prior knowledge of audio-visual correspondence to the video inpainting network. This prior knowledge is transferred through our proposed two novel losses: audio-visual attention loss and audio-visual pseudo-class consistency loss. These two losses further improve the performance of the video inpainting by encouraging the inpainting result to have a high correspondence to its synchronized audio. Experimental results demonstrate that our proposed method can restore a wider domain of video scenes and is particularly effective when the sounding object in the scene is partially blinded.

翻译：人类能够基于音视频事件的先验知识，仅凭听觉信息轻松想象出场景。本文模拟人类这一天生能力，利用深度学习模型提升视频修复质量。为实现先验知识编码，我们首先训练音视频网络学习听觉与视觉信息之间的对应关系；随后将该音视频网络作为引导器，将音视频对应关系的先验知识传递至视频修复网络。该先验知识通过我们提出的两项新型损失函数进行迁移：音视频注意力损失与音视频伪类一致性损失。这两项损失通过增强修复结果与其同步音频的高度对应性，进一步提升视频修复性能。实验结果表明，本方法能够恢复更广泛场景的视频内容，尤其适用于场景中发声物体部分遮挡的情况。

0

相关内容

图像修复

图像修复（英语：Inpainting）指重建的图像和视频中丢失或损坏的部分的过程。例如在博物馆中，这项工作常由经验丰富的博物馆管理员或者艺术品修复师来进行。数码世界中，图像修复又称图像插值或视频插值，指利用复杂的算法来替换已丢失、损坏的图像数据，主要替换一些小区域和瑕疵。

【CVPR 2022】基于元内存传输的跨域少镜头语义分割，Remember the Difference: Cross-Domain Few-Shot Semantic Segmentation via Meta-Memory Transfer

【CVPR 2022】基于元内存传输的跨域少镜头语义分割，Remember the Difference: Cross-Domain Few-Shot Semantic Segmentation via Meta-Memory Transfer

专知会员服务

13+阅读 · 2022年3月12日

O’Reilly报告：知识图谱崛起——面向现代数据集成和数据结构体系，“The Rise of the Knowledge Graph——Toward Modern Data Integration and the Data Fabric Architecture”

O’Reilly报告：知识图谱崛起——面向现代数据集成和数据结构体系，“The Rise of the Knowledge Graph——Toward Modern Data Integration and the Data Fabric Architecture”

专知会员服务

49+阅读 · 2022年2月18日

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

82+阅读 · 2020年7月26日

FlowQA: Grasping Flow in History for Conversational Machine Comprehension

FlowQA: Grasping Flow in History for Conversational Machine Comprehension

专知会员服务

34+阅读 · 2019年10月18日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

《DeepGCNs: Making GCNs Go as Deep as CNNs》

《DeepGCNs: Making GCNs Go as Deep as CNNs》

专知会员服务

32+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

164+阅读 · 2019年10月12日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

44+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

STRCF for Visual Object Tracking

STRCF for Visual Object Tracking

统计学习与视觉计算组

15+阅读 · 2018年5月29日

Hierarchical Imitation - Reinforcement Learning

Hierarchical Imitation - Reinforcement Learning

CreateAMind

19+阅读 · 2018年5月25日

Focal Loss for Dense Object Detection

Focal Loss for Dense Object Detection

统计学习与视觉计算组

12+阅读 · 2018年3月15日

IJCAI | Cascade Dynamics Modeling with Attention-based RNN

IJCAI | Cascade Dynamics Modeling with Attention-based RNN

KingsGarden

13+阅读 · 2017年7月16日

From Softmax to Sparsemax-ICML16（1）

From Softmax to Sparsemax-ICML16（1）

KingsGarden

74+阅读 · 2016年11月26日

城市“建成环境——空间行为”的多尺度影响关系与机理研究

国家自然科学基金

13+阅读 · 2017年12月31日

“Fishes-in-net” 酵母孢子微胶囊式近平滑假丝酵母SCRII酶有机相高效手性合成机制研究

国家自然科学基金

3+阅读 · 2016年12月31日

Musielak-Orlicz-Sobolev 空间中的迹嵌入及其应用

国家自然科学基金

2+阅读 · 2015年12月31日

MEMS数字地震检波器专用DSP芯片优化设计

国家自然科学基金

1+阅读 · 2015年12月31日

高频ZnO/IDT/SiO2/金刚石SAW乳腺癌抗原免疫传感器研究

国家自然科学基金

1+阅读 · 2015年12月31日

Schr？dinger-Poisson方程守恒DDG方法研究

国家自然科学基金

2+阅读 · 2015年12月31日

动态Gr？bner 基与GVW算法

国家自然科学基金

0+阅读 · 2014年12月31日

“杰文斯”悖论、能效政策改进与“双控目标”分解

国家自然科学基金

0+阅读 · 2014年12月31日

海量Web用户生成内容物化关键技术

国家自然科学基金

2+阅读 · 2014年12月31日

水窗及水窗附近软X射线激光研究

国家自然科学基金

0+阅读 · 2014年12月31日

Self-Supervised Motion Magnification by Backpropagating Through Optical Flow

Arxiv

0+阅读 · 2023年11月28日

Space-Time Diffusion Features for Zero-Shot Text-Driven Motion Transfer

Arxiv

0+阅读 · 2023年11月28日

Continuously Controllable Facial Expression Editing in Talking Face Videos

Arxiv

0+阅读 · 2023年11月28日

Enhanced Low-Complexity FDD System Feedback with Variable Bit Lengths via Generative Modeling

Arxiv

0+阅读 · 2023年11月28日

A Closer Look at Audio-Visual Segmentation

Arxiv

0+阅读 · 2023年11月27日

Scale-Adaptive Feature Aggregation for Efficient Space-Time Video Super-Resolution

Arxiv

0+阅读 · 2023年11月24日

Weakly-Supervised Video Moment Retrieval via Regularized Two-Branch Proposal Networks with Erasing Mechanism

Arxiv

0+阅读 · 2023年11月23日

A Large-Scale Study on Unsupervised Spatiotemporal Representation Learning

Arxiv

11+阅读 · 2021年4月29日

Few-Shot Graph Classification with Model Agnostic Meta-Learning

Arxiv

23+阅读 · 2020年3月18日

Meta-Transfer Learning for Zero-Shot Super-Resolution

Meta-Transfer Learning for Zero-Shot Super-Resolution

Arxiv

43+阅读 · 2020年2月27日

VIP会员

文章信息

相关主题

知识 (knowledge)

最新内容

面向特种部队的、以操作员为中心的人工智能决策支持系统框架

面向特种部队的、以操作员为中心的人工智能决策支持系统框架

专知会员服务

3+阅读 · 今天7:54

《多域战场上反制小型无人机系统》150页

《多域战场上反制小型无人机系统》150页

专知会员服务

8+阅读 · 今天7:47

《基于成果军事教育框架下的军官联合职业军事教育认证程序》2026最新170页

《基于成果军事教育框架下的军官联合职业军事教育认证程序》2026最新170页

专知会员服务

2+阅读 · 今天7:43

战场人工智能：增强陆地作战能力的发现与要求

战场人工智能：增强陆地作战能力的发现与要求

专知会员服务

2+阅读 · 今天7:37

人工智能赋能指挥所：以人工智能为中心的指挥控制的核心要素

人工智能赋能指挥所：以人工智能为中心的指挥控制的核心要素

专知会员服务

3+阅读 · 今天7:33

以人工智能为中心的指挥控制

以人工智能为中心的指挥控制

专知会员服务

1+阅读 · 今天7:14

《通过适应复杂环境与特殊作战行动动态来变革情报周期》

《通过适应复杂环境与特殊作战行动动态来变革情报周期》

专知会员服务

2+阅读 · 今天4:15

俄乌冲突背景下军事特种公路运输日益增长的重要性

俄乌冲突背景下军事特种公路运输日益增长的重要性

专知会员服务

2+阅读 · 今天3:44

速度优先于谨慎：NSPM-11意味着什么（将人工智能融入美国国防和情报行动最全面的声明）

速度优先于谨慎：NSPM-11意味着什么（将人工智能融入美国国防和情报行动最全面的声明）

专知会员服务

7+阅读 · 6月10日

《基于深度强化学习的反无人机技术研究》178页

《基于深度强化学习的反无人机技术研究》178页

专知会员服务

10+阅读 · 6月10日

技术突破与战略优势竞争：美军人工智能技术运用阶段分析

技术突破与战略优势竞争：美军人工智能技术运用阶段分析

专知会员服务

6+阅读 · 6月10日

“史诗怒火”行动与“AI中心战”模式的浮现

“史诗怒火”行动与“AI中心战”模式的浮现

专知会员服务

10+阅读 · 6月10日

【CVPR2026教程】扩散模型的解析理解

【CVPR2026教程】扩散模型的解析理解

专知会员服务

5+阅读 · 6月10日

【CVPR2026教程】从感知到模拟：多模态推理中世界模型的涌现

【CVPR2026教程】从感知到模拟：多模态推理中世界模型的涌现

专知会员服务

5+阅读 · 6月10日

马赛克战：俄乌战场透析

马赛克战：俄乌战场透析

专知会员服务

16+阅读 · 6月10日

相关VIP内容

【CVPR 2022】基于元内存传输的跨域少镜头语义分割，Remember the Difference: Cross-Domain Few-Shot Semantic Segmentation via Meta-Memory Transfer

【CVPR 2022】基于元内存传输的跨域少镜头语义分割，Remember the Difference: Cross-Domain Few-Shot Semantic Segmentation via Meta-Memory Transfer

专知会员服务

13+阅读 · 2022年3月12日

O’Reilly报告：知识图谱崛起——面向现代数据集成和数据结构体系，“The Rise of the Knowledge Graph——Toward Modern Data Integration and the Data Fabric Architecture”

O’Reilly报告：知识图谱崛起——面向现代数据集成和数据结构体系，“The Rise of the Knowledge Graph——Toward Modern Data Integration and the Data Fabric Architecture”

专知会员服务

49+阅读 · 2022年2月18日

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

82+阅读 · 2020年7月26日

FlowQA: Grasping Flow in History for Conversational Machine Comprehension

FlowQA: Grasping Flow in History for Conversational Machine Comprehension

专知会员服务

34+阅读 · 2019年10月18日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

《DeepGCNs: Making GCNs Go as Deep as CNNs》

《DeepGCNs: Making GCNs Go as Deep as CNNs》

专知会员服务

32+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

164+阅读 · 2019年10月12日

热门VIP内容

开通专知VIP会员享更多权益服务

《多域战场上反制小型无人机系统》150页

战场人工智能：增强陆地作战能力的发现与要求

面向特种部队的、以操作员为中心的人工智能决策支持系统框架

《基于成果军事教育框架下的军官联合职业军事教育认证程序》2026最新170页

相关资讯

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

44+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

STRCF for Visual Object Tracking

STRCF for Visual Object Tracking

统计学习与视觉计算组

15+阅读 · 2018年5月29日

Hierarchical Imitation - Reinforcement Learning

Hierarchical Imitation - Reinforcement Learning

CreateAMind

19+阅读 · 2018年5月25日

Focal Loss for Dense Object Detection

Focal Loss for Dense Object Detection

统计学习与视觉计算组

12+阅读 · 2018年3月15日

IJCAI | Cascade Dynamics Modeling with Attention-based RNN

IJCAI | Cascade Dynamics Modeling with Attention-based RNN

KingsGarden

13+阅读 · 2017年7月16日

From Softmax to Sparsemax-ICML16（1）

From Softmax to Sparsemax-ICML16（1）

KingsGarden

74+阅读 · 2016年11月26日

相关论文

Self-Supervised Motion Magnification by Backpropagating Through Optical Flow

Arxiv

0+阅读 · 2023年11月28日

Space-Time Diffusion Features for Zero-Shot Text-Driven Motion Transfer

Arxiv

0+阅读 · 2023年11月28日

Continuously Controllable Facial Expression Editing in Talking Face Videos

Arxiv

0+阅读 · 2023年11月28日

Enhanced Low-Complexity FDD System Feedback with Variable Bit Lengths via Generative Modeling

Arxiv

0+阅读 · 2023年11月28日

A Closer Look at Audio-Visual Segmentation

Arxiv

0+阅读 · 2023年11月27日

Scale-Adaptive Feature Aggregation for Efficient Space-Time Video Super-Resolution

Arxiv

0+阅读 · 2023年11月24日

Weakly-Supervised Video Moment Retrieval via Regularized Two-Branch Proposal Networks with Erasing Mechanism

Arxiv

0+阅读 · 2023年11月23日

A Large-Scale Study on Unsupervised Spatiotemporal Representation Learning

Arxiv

11+阅读 · 2021年4月29日

Few-Shot Graph Classification with Model Agnostic Meta-Learning

Arxiv

23+阅读 · 2020年3月18日

Meta-Transfer Learning for Zero-Shot Super-Resolution

Meta-Transfer Learning for Zero-Shot Super-Resolution

Arxiv

43+阅读 · 2020年2月27日

相关基金

城市“建成环境——空间行为”的多尺度影响关系与机理研究

国家自然科学基金

13+阅读 · 2017年12月31日

“Fishes-in-net” 酵母孢子微胶囊式近平滑假丝酵母SCRII酶有机相高效手性合成机制研究

国家自然科学基金

3+阅读 · 2016年12月31日

Musielak-Orlicz-Sobolev 空间中的迹嵌入及其应用

国家自然科学基金

2+阅读 · 2015年12月31日

MEMS数字地震检波器专用DSP芯片优化设计

国家自然科学基金

1+阅读 · 2015年12月31日

高频ZnO/IDT/SiO2/金刚石SAW乳腺癌抗原免疫传感器研究

国家自然科学基金

1+阅读 · 2015年12月31日

Schr？dinger-Poisson方程守恒DDG方法研究

国家自然科学基金

2+阅读 · 2015年12月31日

动态Gr？bner 基与GVW算法

国家自然科学基金

0+阅读 · 2014年12月31日

“杰文斯”悖论、能效政策改进与“双控目标”分解

国家自然科学基金

0+阅读 · 2014年12月31日

海量Web用户生成内容物化关键技术

国家自然科学基金

2+阅读 · 2014年12月31日

水窗及水窗附近软X射线激光研究

国家自然科学基金

0+阅读 · 2014年12月31日

微信扫码咨询专知VIP会员