We introduce the problem of phone classification in the context of speech recognition and explore several sets of local spectro-temporal features that can be used for it. In particular, we present preliminary results for phone classification using two feature sets commonly used for object detection: Haar features and Histograms of Gradients (HoG) classified with an SVM.