Low-frequency oscillator (LFO) driven audio effects, such as phaser, flanger, and chorus, modify an input signal using time-varying filters and delays, resulting in characteristic sweeping or widening effects. It has been shown that these effects can be modeled using neural networks when conditioned on the ground truth LFO signal. However, in most cases the LFO signal is not accessible, and measuring it from the audio signal is nontrivial, which hinders the modeling process. To address this, we propose a framework capable of extracting arbitrary LFO signals from processed audio across multiple digital audio effects, parameter settings, and instrument configurations. Because our system imposes no restrictions on the LFO signal shape, it can extract quasiperiodic, combined, and distorted modulation signals that are relevant to effect modeling, as we demonstrate. Furthermore, we show how coupling the extraction model with a simple processing network enables training of end-to-end black-box models of unseen analog or digital LFO-driven audio effects using only dry and wet audio pairs, overcoming the need to access the audio effect or its internal LFO signal. We make our code available and provide the trained audio effect models in a real-time VST plugin.
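To make the notion of an LFO-driven effect concrete, the following minimal sketch shows a textbook-style flanger in which a sine LFO modulates a short delay. It is an illustration only and not the framework or any effect implementation from this work: the function names `sine_lfo` and `flanger`, the delay range, the mix value, and the noise input are all assumptions chosen for the example.

```python
import numpy as np

def sine_lfo(num_samples, sample_rate, rate_hz, phase=0.0):
    """Return a sine LFO scaled to the range [0, 1] (hypothetical helper)."""
    t = np.arange(num_samples) / sample_rate
    return 0.5 * (1.0 + np.sin(2.0 * np.pi * rate_hz * t + phase))

def flanger(dry, lfo, sample_rate, min_delay_ms=1.0, max_delay_ms=5.0, mix=0.5):
    """Mix the dry signal with a copy delayed by an LFO-controlled amount.

    The delay range and mix are illustrative defaults, not values from the paper.
    """
    n = np.arange(len(dry))
    # Map the LFO in [0, 1] to a time-varying delay expressed in samples.
    delay_ms = min_delay_ms + (max_delay_ms - min_delay_ms) * lfo
    delay_samples = delay_ms * sample_rate / 1000.0
    read_pos = n - delay_samples
    # Linear interpolation at fractional read positions;
    # positions before the start of the signal read as silence.
    delayed = np.interp(read_pos, n, dry, left=0.0)
    return (1.0 - mix) * dry + mix * delayed

sample_rate = 44100
rng = np.random.default_rng(0)
dry = rng.uniform(-1.0, 1.0, sample_rate)  # one second of noise as a stand-in input
lfo = sine_lfo(len(dry), sample_rate, rate_hz=0.5)
wet = flanger(dry, lfo, sample_rate)
```

In this toy setting the `lfo` array is known by construction. The problem described above is the opposite case, where only a `dry` and `wet` pair is available and the modulation signal has to be estimated from the audio.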