Keyword spotting (KWS) is one of the speech recognition tasks most sensitive to the quality of the feature representation. However, research on KWS has traditionally focused on new model topologies and has placed little emphasis on other aspects such as feature extraction. This paper investigates the use of the multitaper technique to create improved features for KWS. The experimental study covers different test scenarios, windows and parameters, datasets, and neural networks commonly used in embedded KWS applications. Experimental results confirm the advantages of the proposed features.
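As a rough illustration of the multitaper idea mentioned in the abstract, the sketch below estimates the power spectrum of a single frame by averaging the spectra obtained with several orthogonal tapers, instead of using one window. It is a minimal sketch under stated assumptions, not the paper's feature pipeline: the abstract does not say which tapers, weights, or frame sizes are used, so the sine tapers, the uniform weighting, the 64-sample frame, and all function names (`sine_tapers`, `power_spectrum`, `multitaper_spectrum`) are illustrative choices.

```python
import cmath
import math


def sine_tapers(frame_len, num_tapers):
    """Return num_tapers orthonormal sine tapers of length frame_len.

    Sine tapers are one common multitaper family; they are an assumption
    here, since the abstract does not name the tapers it evaluates.
    """
    scale = math.sqrt(2.0 / (frame_len + 1))
    return [
        [scale * math.sin(math.pi * k * (n + 1) / (frame_len + 1))
         for n in range(frame_len)]
        for k in range(1, num_tapers + 1)
    ]


def power_spectrum(samples):
    """Naive one-sided DFT power spectrum (O(N^2), fine for a short demo frame)."""
    n = len(samples)
    spectrum = []
    for b in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * b * i / n)
                  for i, x in enumerate(samples))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def multitaper_spectrum(frame, num_tapers=4):
    """Average the per-taper power spectra (uniform weights, an assumption)."""
    tapers = sine_tapers(len(frame), num_tapers)
    per_taper = [power_spectrum([h * x for h, x in zip(taper, frame)])
                 for taper in tapers]
    return [sum(column) / num_tapers for column in zip(*per_taper)]


# Toy frame: a sinusoid centred on DFT bin 8 of a 64-sample frame.
frame = [math.sin(2 * math.pi * 8 * n / 64) for n in range(64)]
spectrum = multitaper_spectrum(frame, num_tapers=4)
peak_bin = max(range(len(spectrum)), key=spectrum.__getitem__)
```

In a KWS front end, a spectrum estimated this way would typically replace the single-window periodogram before the filterbank and log or cepstral steps; averaging over tapers trades some frequency resolution for a lower-variance estimate.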