Diffusion-Based Audio Inpainting - 专知论文

会员服务 ·

0

图像修复 · Performer · MoDELS · state-of-the-art · Networking ·

2023 年 5 月 24 日

Diffusion-Based Audio Inpainting

翻译：基于扩散的音频修补

Eloi Moliner,Vesa Välimäki

from arxiv, Submitted for publication to the Journal of Audio Engineering Society on January 30th, 2023

Audio inpainting aims to reconstruct missing segments in corrupted recordings. Previous methods produce plausible reconstructions when the gap length is shorter than about 100\;ms, but the quality decreases for longer gaps. This paper explores recent advancements in deep learning and, particularly, diffusion models, for the task of audio inpainting. The proposed method uses an unconditionally trained generative model, which can be conditioned in a zero-shot fashion for audio inpainting, offering high flexibility to regenerate gaps of arbitrary length. An improved deep neural network architecture based on the constant-Q transform, which allows the model to exploit pitch-equivariant symmetries in audio, is also presented. The performance of the proposed algorithm is evaluated through objective and subjective metrics for the task of reconstructing short to mid-sized gaps. The results of a formal listening test show that the proposed method delivers a comparable performance against state-of-the-art for short gaps, while retaining a good audio quality and outperforming the baselines for the longest gap lengths tested, 150\;ms and 200\;ms. This work helps improve the restoration of sound recordings having fairly long local disturbances or dropouts, which must be reconstructed.

翻译：音频修补旨在重建受损录音中的缺失片段。以往方法在缺口长度小于约100毫秒时能产生合理的重建效果，但较长缺口的质量会下降。本文探讨了深度学习，特别是扩散模型在音频修补任务中的最新进展。所提方法使用无条件训练生成模型，可通过零样本方式条件化为音频修补，具备高度灵活性以再生任意长度的缺口。同时，提出了一种基于常数Q变换的改进深度神经网络架构，该架构使模型能够利用音频中的音高等变对称性。通过客观与主观指标评估了所提算法在重建短至中等长度缺口任务中的性能。正式听力测试结果表明，所提方法在短缺口方面与最先进技术性能相当，在测试的最长缺口（150毫秒和200毫秒）上保持了良好的音频质量并优于基线。本工作有助于改善对存在较长局部干扰或信号丢失的录音的重建。

0

相关内容

图像修复

图像修复（英语：Inpainting）指重建的图像和视频中丢失或损坏的部分的过程。例如在博物馆中，这项工作常由经验丰富的博物馆管理员或者艺术品修复师来进行。数码世界中，图像修复又称图像插值或视频插值，指利用复杂的算法来替换已丢失、损坏的图像数据，主要替换一些小区域和瑕疵。

ICLR 2021杰出论文奖出炉，8篇论文上榜！

专知会员服务

26+阅读 · 2021年4月2日

50+篇《神经架构搜索NAS》2020论文合集

专知会员服务

61+阅读 · 2020年3月19日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

167+阅读 · 2020年3月18日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

37+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

60+阅读 · 2019年10月17日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

84+阅读 · 2019年10月9日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

直播 | Interpretable and Trustworthy Graph Geometric Deep Learning

直播 | Interpretable and Trustworthy Graph Geometric Deep Learning

图与推荐

2+阅读 · 2022年11月2日

BERT/Transformer/迁移学习NLP资源大列表

BERT/Transformer/迁移学习NLP资源大列表

专知

19+阅读 · 2019年6月9日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

16+阅读 · 2019年1月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

44+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

vae 相关论文表示学习 1

vae 相关论文表示学习 1

CreateAMind

12+阅读 · 2018年9月6日

【推荐】GAN架构入门综述(资源汇总)

【推荐】GAN架构入门综述(资源汇总)

机器学习研究会

10+阅读 · 2017年9月3日

南海东北部内孤立波的地震海洋学研究

国家自然科学基金

0+阅读 · 2015年12月31日

马铃薯块茎发育过程中茉莉酸调控的磷酸化蛋白质组研究

国家自然科学基金

0+阅读 · 2014年12月31日

纳米铁磁线磁畴壁本征棘轮效应的研究

国家自然科学基金

0+阅读 · 2014年12月31日

基于共聚焦干涉的表面等离子体在液态环境中的显微成像与传感方法研究

国家自然科学基金

0+阅读 · 2013年12月31日

胶体晶体自组装缺陷的原位动态三维显微成像研究

国家自然科学基金

0+阅读 · 2013年12月31日

立方（cubic)-TiB的合成、晶体结构与物理性能

国家自然科学基金

0+阅读 · 2011年12月31日

微纳米尺度掺矿物掺合料水泥基材料的弹性性能研究

国家自然科学基金

0+阅读 · 2011年12月31日

锆杂化乙烯基笼形倍半硅氧烷的合成与应用研究

国家自然科学基金

0+阅读 · 2009年12月31日

非均匀增益液体激光介质的输出特性研究

国家自然科学基金

0+阅读 · 2009年12月31日

南海水下珊瑚礁白化光学遥感方法研究

国家自然科学基金

0+阅读 · 2009年12月31日

Facial Reenactment Through a Personalized Generator

Arxiv

0+阅读 · 2023年7月12日

Diffusion Based Multi-Agent Adversarial Tracking

Arxiv

0+阅读 · 2023年7月12日

Metropolis Sampling for Constrained Diffusion Models

Arxiv

0+阅读 · 2023年7月11日

RAPHAEL: Text-to-Image Generation via Large Mixture of Diffusion Paths

Arxiv

0+阅读 · 2023年7月11日

Divide, Evaluate, and Refine: Evaluating and Improving Text-to-Image Alignment with Iterative VQA Feedback

Arxiv

0+阅读 · 2023年7月10日

Point Cloud Diffusion Models for Automatic Implant Generation

Arxiv

0+阅读 · 2023年7月10日

MiVOLO: Multi-input Transformer for Age and Gender Estimation

Arxiv

0+阅读 · 2023年7月10日

Nested Diffusion Processes for Anytime Image Generation

Arxiv

0+阅读 · 2023年7月7日

Concealed Object Detection for Passive Millimeter-Wave Security Imaging Based on Task-Aligned Detection Transformer

Arxiv

0+阅读 · 2023年7月7日

Adversarial Mutual Information for Text Generation

Adversarial Mutual Information for Text Generation

Arxiv

13+阅读 · 2020年6月30日

VIP会员

文章信息

相关主题

state-of-the-art

最新内容

论文解读 | 医学图像修复中的扩散模型：挑战、分类与未来方向

论文解读 | 医学图像修复中的扩散模型：挑战、分类与未来方向

专知会员服务

0+阅读 · 7月28日

博士论文 | 从算法到基础模型：强化学习的统一视角

博士论文 | 从算法到基础模型：强化学习的统一视角

专知会员服务

0+阅读 · 7月28日

面向国防作战的最佳自主与蜂群无人机技术

面向国防作战的最佳自主与蜂群无人机技术

专知会员服务

5+阅读 · 7月28日

《异构人类团队的协作决策过程混合建模研究》

《异构人类团队的协作决策过程混合建模研究》

专知会员服务

4+阅读 · 7月28日

《C5ISR系统中的注意力动态与自适应决策支持研究：视觉与多模态注意力引导对任务绩效影响的递归量化分析》最新36页报告

《C5ISR系统中的注意力动态与自适应决策支持研究：视觉与多模态注意力引导对任务绩效影响的递归量化分析》最新36页报告

专知会员服务

4+阅读 · 7月28日

《设计思维中的人机协作：生成式人工智能对共情访谈影响的探究》140页

《设计思维中的人机协作：生成式人工智能对共情访谈影响的探究》140页

专知会员服务

4+阅读 · 7月28日

博士论文 | 面向大模型推理的内存高效算法

博士论文 | 面向大模型推理的内存高效算法

专知会员服务

5+阅读 · 7月27日

论文解读 | 从预训练到后训练：理解大模型推理能力如何形成

论文解读 | 从预训练到后训练：理解大模型推理能力如何形成

专知会员服务

7+阅读 · 7月27日

《无人系统互操作性导论——无人系统联合架构（JAUS）》

《无人系统互操作性导论——无人系统联合架构（JAUS）》

专知会员服务

13+阅读 · 7月27日

美空军新型反无人机部队初探

美空军新型反无人机部队初探

专知会员服务

8+阅读 · 7月27日

《对抗性电磁环境下远程巡飞弹作战的安全指挥与控制数据链》

《对抗性电磁环境下远程巡飞弹作战的安全指挥与控制数据链》

专知会员服务

7+阅读 · 7月27日

《北约下一代建模与仿真（NexGen M&S）计划》2026年69页

《北约下一代建模与仿真（NexGen M&S）计划》2026年69页

专知会员服务

5+阅读 · 7月27日

《防空交战流程的概率建模研究》

《防空交战流程的概率建模研究》

专知会员服务

12+阅读 · 7月27日

ICML 2026 教程 | 数值优化理论还重要吗？

ICML 2026 教程 | 数值优化理论还重要吗？

专知会员服务

7+阅读 · 7月26日

ICM 2026 | 陶哲轩：人工智能时代的数学

ICM 2026 | 陶哲轩：人工智能时代的数学

专知会员服务

10+阅读 · 7月26日

相关VIP内容

ICLR 2021杰出论文奖出炉，8篇论文上榜！

专知会员服务

26+阅读 · 2021年4月2日

50+篇《神经架构搜索NAS》2020论文合集

专知会员服务

61+阅读 · 2020年3月19日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

167+阅读 · 2020年3月18日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

37+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

60+阅读 · 2019年10月17日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

84+阅读 · 2019年10月9日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

博士论文 | 从算法到基础模型：强化学习的统一视角

《异构人类团队的协作决策过程混合建模研究》

论文解读 | 医学图像修复中的扩散模型：挑战、分类与未来方向

面向国防作战的最佳自主与蜂群无人机技术

相关资讯

直播 | Interpretable and Trustworthy Graph Geometric Deep Learning

直播 | Interpretable and Trustworthy Graph Geometric Deep Learning

图与推荐

2+阅读 · 2022年11月2日

BERT/Transformer/迁移学习NLP资源大列表

BERT/Transformer/迁移学习NLP资源大列表

专知

19+阅读 · 2019年6月9日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

16+阅读 · 2019年1月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

44+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

vae 相关论文表示学习 1

vae 相关论文表示学习 1

CreateAMind

12+阅读 · 2018年9月6日

【推荐】GAN架构入门综述(资源汇总)

【推荐】GAN架构入门综述(资源汇总)

机器学习研究会

10+阅读 · 2017年9月3日

相关论文

Facial Reenactment Through a Personalized Generator

Arxiv

0+阅读 · 2023年7月12日

Diffusion Based Multi-Agent Adversarial Tracking

Arxiv

0+阅读 · 2023年7月12日

Metropolis Sampling for Constrained Diffusion Models

Arxiv

0+阅读 · 2023年7月11日

RAPHAEL: Text-to-Image Generation via Large Mixture of Diffusion Paths

Arxiv

0+阅读 · 2023年7月11日

Divide, Evaluate, and Refine: Evaluating and Improving Text-to-Image Alignment with Iterative VQA Feedback

Arxiv

0+阅读 · 2023年7月10日

Point Cloud Diffusion Models for Automatic Implant Generation

Arxiv

0+阅读 · 2023年7月10日

MiVOLO: Multi-input Transformer for Age and Gender Estimation

Arxiv

0+阅读 · 2023年7月10日

Nested Diffusion Processes for Anytime Image Generation

Arxiv

0+阅读 · 2023年7月7日

Concealed Object Detection for Passive Millimeter-Wave Security Imaging Based on Task-Aligned Detection Transformer

Arxiv

0+阅读 · 2023年7月7日

Adversarial Mutual Information for Text Generation

Adversarial Mutual Information for Text Generation

Arxiv

13+阅读 · 2020年6月30日

相关基金

南海东北部内孤立波的地震海洋学研究

国家自然科学基金

0+阅读 · 2015年12月31日

马铃薯块茎发育过程中茉莉酸调控的磷酸化蛋白质组研究

国家自然科学基金

0+阅读 · 2014年12月31日

纳米铁磁线磁畴壁本征棘轮效应的研究

国家自然科学基金

0+阅读 · 2014年12月31日

基于共聚焦干涉的表面等离子体在液态环境中的显微成像与传感方法研究

国家自然科学基金

0+阅读 · 2013年12月31日

胶体晶体自组装缺陷的原位动态三维显微成像研究

国家自然科学基金

0+阅读 · 2013年12月31日

立方（cubic)-TiB的合成、晶体结构与物理性能

国家自然科学基金

0+阅读 · 2011年12月31日

微纳米尺度掺矿物掺合料水泥基材料的弹性性能研究

国家自然科学基金

0+阅读 · 2011年12月31日

锆杂化乙烯基笼形倍半硅氧烷的合成与应用研究

国家自然科学基金

0+阅读 · 2009年12月31日

非均匀增益液体激光介质的输出特性研究

国家自然科学基金

0+阅读 · 2009年12月31日

南海水下珊瑚礁白化光学遥感方法研究

国家自然科学基金

0+阅读 · 2009年12月31日

微信扫码咨询专知VIP会员