We introduce the problem of phone classification in the context of speech recognition and explore several sets of local spectro-temporal features that can be used for this task. In particular, we present preliminary results for phone classification using two feature sets that are commonly used for object detection: Haar features and SVM-classified Histograms of Gradients (HoG).
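To make the idea of a local spectro-temporal Haar feature concrete, the following is a minimal sketch, not the authors' feature set. It assumes a spectrogram patch stored as a NumPy array with frequency bins as rows and time frames as columns, and it computes one two-rectangle Haar-like feature (right half minus left half along the time axis) using an integral image. The function names, the patch size, and the rectangle layout are all hypothetical choices made for illustration.

```python
import numpy as np


def integral_image(patch):
    """Summed-area table with a zero row and zero column prepended."""
    ii = np.zeros((patch.shape[0] + 1, patch.shape[1] + 1), dtype=np.float64)
    ii[1:, 1:] = patch.cumsum(axis=0).cumsum(axis=1)
    return ii


def rect_sum(ii, r0, c0, r1, c1):
    """Sum of patch[r0:r1, c0:c1], read from the integral image in O(1)."""
    return ii[r1, c1] - ii[r0, c1] - ii[r1, c0] + ii[r0, c0]


def haar_two_rect_time(ii, r0, c0, height, width):
    """Two-rectangle Haar-like feature split along the time axis.

    Returns (energy in the right half) - (energy in the left half),
    which responds to onsets and offsets inside the window.
    """
    half = width // 2
    left = rect_sum(ii, r0, c0, r0 + height, c0 + half)
    right = rect_sum(ii, r0, c0 + half, r0 + height, c0 + 2 * half)
    return right - left


# Toy patch (assumed shape): 8 frequency bins x 10 time frames,
# silent for the first 5 frames and with unit energy afterwards.
patch = np.zeros((8, 10))
patch[:, 5:] = 1.0

ii = integral_image(patch)
onset = haar_two_rect_time(ii, 0, 0, 8, 10)
print(onset)
```

On this toy patch the feature is large and positive because all the energy lies in the later half of the window. In a classifier, many such rectangles at different positions, scales, and orientations (time splits and frequency splits) would typically be evaluated, and the integral image keeps each evaluation constant-time.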