Acoustic recognition is a common deep learning task in recent research, and it typically relies on spectral feature extraction such as the short-time Fourier transform (STFT) and the wavelet transform. However, few studies discuss the advantages and drawbacks of these spectral feature extractors or compare their performance. This paper therefore compares the attributes of the time-frequency representations produced by these two transforms: the spectrogram (from the STFT) and the scalogram (from the wavelet transform). A convolutional neural network (CNN) for acoustic fault recognition is implemented, and its performance with each of the two spectral extractors is recorded for comparison. A recent study on the same audio database is used as a benchmark to assess how well the designed spectrogram and scalogram perform. The advantages and limitations of both representations are also analyzed. The results indicate suitable application scenarios for the spectrogram and the scalogram, as well as potential directions for further research in acoustic recognition.
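To make the two representations concrete, the following minimal sketch computes a spectrogram and a scalogram for a synthetic tone using only NumPy. It is an illustration under stated assumptions, not the paper's pipeline: the Hann window, the complex Morlet wavelet, all parameter values (`n_fft`, `hop`, `w0`, the frequency grid), the function names `stft_spectrogram` and `morlet_scalogram`, and the 1 kHz test signal are hypothetical choices that do not come from the paper.

```python
import numpy as np


def stft_spectrogram(x, n_fft=256, hop=64):
    """Magnitude spectrogram from a Hann-windowed short-time Fourier transform.

    Window type and sizes are assumptions for illustration.
    Returns an array of shape (n_fft // 2 + 1, n_frames).
    """
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack(
        [x[i * hop:i * hop + n_fft] * window for i in range(n_frames)]
    )
    return np.abs(np.fft.rfft(frames, axis=1)).T


def morlet_scalogram(x, fs, freqs, w0=6.0):
    """Magnitude scalogram from a continuous wavelet transform.

    Uses an analytic Morlet wavelet (an assumed choice) applied in the
    frequency domain, so the convolution is circular.
    Returns an array of shape (len(freqs), len(x)).
    """
    n = len(x)
    x_hat = np.fft.fft(x)
    omega = 2.0 * np.pi * np.fft.fftfreq(n, d=1.0 / fs)  # rad/s
    out = np.empty((len(freqs), n))
    for k, f in enumerate(freqs):
        scale = w0 / (2.0 * np.pi * f)  # scale whose centre frequency is f
        # Fourier transform of the Morlet wavelet, positive frequencies only
        psi_hat = np.pi ** -0.25 * np.exp(-0.5 * (scale * omega - w0) ** 2)
        psi_hat = psi_hat * (omega > 0)
        out[k] = np.abs(np.fft.ifft(x_hat * psi_hat))
    return out


# Synthetic 1 kHz tone standing in for a real recording
fs = 8000
t = np.arange(fs) / fs
x = np.sin(2.0 * np.pi * 1000.0 * t)

spec = stft_spectrogram(x)                 # linear frequency axis, fixed window
freqs = np.array([250.0, 500.0, 1000.0, 2000.0])
scal = morlet_scalogram(x, fs, freqs)      # one row per analysed frequency
```

The spectrogram uses one fixed window length for every frequency bin, while the scalogram stretches or compresses the wavelet per row, which is the resolution trade-off that a comparison of the two representations turns on. Either array can be fed to a CNN as a single-channel image.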