Recurrent neural networks (RNNs) are effective at emulating the non-linear, stateful behaviour of analog guitar amplifiers and distortion effects. Unlike direct circuit simulation, an RNN has a fixed sample rate encoded in its model weights, so the sample rate cannot be adjusted at inference. Recent work has proposed increasing the sample rate of RNNs at inference (oversampling) by increasing the feedback delay length in samples, using a fractional delay filter for non-integer conversions. Here, we investigate the task of lowering the sample rate at inference (undersampling), and propose using an extrapolation filter to approximate the required fractional signal advance. We consider two filter design methods and analyse the impact of filter order on audio quality. Our results show that the correct choice of filter can give high-quality results for both oversampling and undersampling; however, in some cases the sample rate adjustment leads to unwanted artefacts in the output signal. We analyse these failure cases through linearised stability analysis, showing that they result from instability around a fixed point. This approach enables an informed prediction of suitable interpolation filters for a given RNN model before runtime.
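As an illustration of the fractional delay and fractional advance idea, the following is a minimal sketch using Lagrange polynomial coefficients. Note the assumptions: the abstract does not name its two filter design methods, so the Lagrange design here is only one plausible choice, and the function name `lagrange_coeffs` and its parameters are hypothetical. A positive `delay` interpolates between past samples; a negative `delay` extrapolates past the newest sample, which is the kind of signal advance undersampling would require.

```python
import numpy as np


def lagrange_coeffs(delay, order):
    """Return FIR taps c[0..order] such that sum_k c[k] * x[n - k]
    approximates x[n - delay].

    Illustrative assumption: Lagrange design. delay in [0, order]
    gives interpolation; delay < 0 gives extrapolation (an advance).
    """
    k = np.arange(order + 1)
    c = np.ones(order + 1)
    for m in range(order + 1):
        mask = k != m
        # c[k] = prod over m != k of (delay - m) / (k - m)
        c[mask] *= (delay - m) / (k[mask] - m)
    return c


# Hypothetical example: advance the signal by a quarter of a sample
# using a second-order extrapolator.
taps = lagrange_coeffs(delay=-0.25, order=2)

# The taps are applied to the newest sample first: x[n], x[n-1], x[n-2].
n = 5
x_hist = np.array([(n - k) ** 2 for k in range(3)], dtype=float)
x_advanced = float(np.dot(taps, x_hist))
```

For `order=1` and `delay=-0.5` this reduces to linear extrapolation, 1.5·x[n] − 0.5·x[n−1]. An order-N Lagrange filter reproduces polynomials up to degree N exactly, so the quadratic test signal above is extrapolated without error; real audio signals will only be approximated, and higher orders amplify high-frequency content more strongly.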