Acoustic recognition is a common deep learning task in recent research, typically relying on spectral feature extraction such as the short-time Fourier transform (STFT) and the wavelet transform. However, few studies discuss the advantages and drawbacks of these two transforms or compare their performance. This paper therefore compares the attributes of the time-frequency representations they produce: the spectrogram (from the STFT) and the scalogram (from the wavelet transform). A convolutional neural network (CNN) for acoustic fault recognition is implemented, and its performance with each representation is recorded for comparison. A recent study on the same audio database serves as a benchmark for assessing how well the designed spectrogram and scalogram perform. The advantages and limitations of both representations are also analyzed. The results indicate suitable application scenarios for the spectrogram and the scalogram, as well as potential directions for further research.