In this study, we present SeMaScore, an evaluation metric for automatic speech recognition tasks that is generated using a segment-wise mapping and scoring algorithm. SeMaScore leverages both the error rate and a more robust similarity score. We show that our algorithm's score generation improves upon the state-of-the-art BERTscore. Our experimental results show that SeMaScore corresponds well with expert human assessments, signal-to-noise ratio levels, and other natural language metrics. SeMaScore is also 41x faster to compute than BERTscore. Overall, we demonstrate that SeMaScore serves as a more dependable evaluation metric, particularly in real-world situations involving atypical speech patterns.