Inspired by the success of recent data augmentation methods for signals that act on time-frequency representations, we introduce an operator that convolves the short-time Fourier transform of a signal with a specified kernel. Its analytical properties, including boundedness, compactness and positivity, are investigated from the perspective of time-frequency analysis. A convolutional neural network and a vision transformer are trained to classify audio signals from spectrograms under different augmentation setups, including the proposed time-frequency blurring operator. The results indicate that the operator can significantly improve test performance, especially in the data-starved regime.
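To make the idea of convolving the short-time Fourier transform with a kernel concrete, the following is a minimal discrete sketch in NumPy, not the construction analysed in the paper. The helper names `stft` and `blur_stft`, the Hann window, the hop size, the zero padding at the borders and the Gaussian kernel are all assumptions chosen for illustration. The sketch stops at the blurred spectrogram and does not map the coefficients back to a signal.

```python
import numpy as np


def stft(signal, window, hop):
    """Discrete STFT: rows are frequency bins, columns are time frames."""
    n = len(window)
    starts = range(0, len(signal) - n + 1, hop)
    frames = np.stack([signal[s:s + n] * window for s in starts], axis=1)
    return np.fft.rfft(frames, axis=0)


def blur_stft(coeffs, kernel):
    """Convolve complex STFT coefficients with a 2-D kernel.

    Assumes an odd-sized kernel; borders are zero padded so the output
    has the same shape as the input (an illustrative choice).
    """
    kf, kt = kernel.shape
    pf, pt = kf // 2, kt // 2
    padded = np.pad(coeffs, ((pf, pf), (pt, pt)))
    flipped = kernel[::-1, ::-1]  # flip the kernel: convolution, not correlation
    out = np.zeros_like(coeffs)
    for i in range(kf):
        for j in range(kt):
            out += flipped[i, j] * padded[i:i + coeffs.shape[0],
                                          j:j + coeffs.shape[1]]
    return out


# Hypothetical input: white noise standing in for an audio clip.
rng = np.random.default_rng(0)
signal = rng.standard_normal(2048)
window = np.hanning(256)
coeffs = stft(signal, window, hop=64)

# Assumed kernel: a normalised 5x5 Gaussian over (frequency, time).
g = np.exp(-0.5 * np.arange(-2, 3) ** 2)
kernel = np.outer(g, g)
kernel /= kernel.sum()

blurred = blur_stft(coeffs, kernel)
augmented = np.abs(blurred) ** 2  # blurred spectrogram, usable as classifier input
```

Because the convolution acts on the complex coefficients rather than on the spectrogram, phase cancellation between neighbouring coefficients affects the result. In an augmentation pipeline the kernel could be resampled for each training example, which is again an assumption about usage rather than a description of the experiments.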