Recent years have seen increasing interest in applying deep learning methods to the modeling of guitar amplifiers or effect pedals. Existing methods are mainly supervised, requiring temporally aligned pairs of unprocessed and rendered audio. However, this approach does not scale well, because creating such data pairs is a complicated process. Recent work by Wright et al. explored the potential of leveraging unpaired data for training, using a generative adversarial network (GAN)-based framework. This paper extends their work by using more advanced discriminators in the GAN and more unpaired data for training. Specifically, drawing inspiration from recent advances in neural vocoders, our GAN-based model for guitar amplifier modeling employs two sets of discriminators, one based on the multi-scale discriminator (MSD) and the other on the multi-period discriminator (MPD). Moreover, we experiment with adding to the training data unprocessed audio signals that have no corresponding rendered audio of the target tone, to see how much the GAN model benefits from the unpaired data. Our experiments show that the two proposed extensions contribute to the modeling of both low-gain and high-gain guitar amplifiers.