This article describes an attempt to apply an ensemble of binary classifiers to the problem of speech assessment in medicine. A dataset was compiled from quantitative and expert assessments of syllable pronunciation quality. The values of seven selected metrics were used as features: dynamic time warping (DTW) distance, Minkowski distance, correlation coefficient, longest common subsequence (LCSS), edit distance on real sequence (EDR), edit distance with real penalty (ERP), and move-split-merge (MSM). The expert assessment of pronunciation quality was used as the class label: class 1 denotes high-quality speech and class 0 denotes distorted speech. Training results were compared for five classification methods: logistic regression (LR), support vector machine (SVM), naive Bayes (NB), decision trees (DT), and K-nearest neighbors (KNN). The results of using the mixture method to build an ensemble of classifiers are also presented. For the data sets studied, the ensemble slightly increased classification accuracy compared with the individual binary classifiers.
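As an illustration of the kind of features and combination step the abstract refers to, the following is a minimal sketch in plain Python, not the authors' implementation. It computes three of the seven listed metrics (DTW distance, Minkowski distance, and the Pearson correlation coefficient) between a reference sequence and a test sequence, and combines binary predictions by simple majority vote. The function names, the absolute-difference local cost in DTW, the default order p = 2, and the majority vote itself are assumptions made for illustration; in particular, the vote is a stand-in and is not the "mixture method" named in the abstract, whose details are not given here.

```python
import math


def dtw_distance(a, b):
    """Classic dynamic time warping distance (assumed local cost: |x - y|)."""
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a[i-1] also matched to an earlier b
                cost[i][j - 1],      # b[j-1] also matched to an earlier a
                cost[i - 1][j - 1],  # one-to-one step
            )
    return cost[n][m]


def minkowski_distance(a, b, p=2):
    """Minkowski distance of order p between equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def pearson_correlation(a, b):
    """Pearson correlation coefficient between equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def majority_vote(predictions):
    """Combine binary (0/1) predictions; an assumed stand-in combiner."""
    return 1 if sum(predictions) * 2 > len(predictions) else 0


# Hypothetical usage: toy sequences and made-up classifier outputs.
reference = [1.0, 2.0, 3.0]
sample = [1.0, 2.0, 2.0, 3.0]
print(dtw_distance(reference, sample))
print(majority_vote([1, 0, 1, 1, 0]))
```

With five base classifiers and two classes, a majority vote cannot tie, which is one reason an odd number of voters is convenient in a sketch like this.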