Mispronunciation detection and diagnosis (MDD) has become increasingly important in computer-assisted language learning and speech technology. In this paper, we propose a method for constructing statistical graphs that encode phoneme confusion patterns as directed graphs, enabling models to learn these patterns. We further introduce a language-specific strategy that captures systematic pronunciation differences across native-language (L1) backgrounds. Experiments on the L2-ARCTIC benchmark show that our approach achieves an F1-score of 59.52%, outperforming several competitive baselines.
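To make the idea of a statistical phoneme confusion graph concrete, the following is a minimal sketch, not the construction used in this paper. It assumes that canonical and realized phonemes have already been aligned, and it weights each directed edge (canonical to realized) by the relative frequency of that substitution. The function names, the `min_count` threshold, and the toy records are all hypothetical; the toy data is invented for illustration and is not drawn from L2-ARCTIC.

```python
from collections import Counter, defaultdict


def build_confusion_graph(pairs, min_count=1):
    """Build a directed confusion graph from aligned phoneme pairs.

    pairs: iterable of (canonical, realized) phoneme labels.
    Returns {canonical: {realized: weight}}, where weight is the fraction
    of occurrences of `canonical` that were realized as `realized`.
    Correct pronunciations (canonical == realized) add no edge, but they
    still count toward the denominator.
    """
    edge_counts = Counter()
    source_totals = Counter()
    for canonical, realized in pairs:
        source_totals[canonical] += 1
        if canonical != realized:
            edge_counts[(canonical, realized)] += 1

    graph = defaultdict(dict)
    for (canonical, realized), count in edge_counts.items():
        # Assumed pruning rule: drop substitutions seen fewer than min_count times.
        if count >= min_count:
            graph[canonical][realized] = count / source_totals[canonical]
    return dict(graph)


def build_per_l1_graphs(records, min_count=1):
    """Build one confusion graph per native-language (L1) group.

    records: iterable of (l1, canonical, realized) triples.
    """
    pairs_by_l1 = defaultdict(list)
    for l1, canonical, realized in records:
        pairs_by_l1[l1].append((canonical, realized))
    return {
        l1: build_confusion_graph(pairs, min_count)
        for l1, pairs in pairs_by_l1.items()
    }


# Invented toy data (hypothetical, not taken from any corpus).
toy_records = [
    ("Mandarin", "TH", "S"),
    ("Mandarin", "TH", "S"),
    ("Mandarin", "TH", "TH"),
    ("Mandarin", "TH", "TH"),
    ("Hindi", "W", "V"),
    ("Hindi", "W", "W"),
]
graphs = build_per_l1_graphs(toy_records)
```

Keeping a separate graph per L1 group is one simple way to reflect language-specific confusion patterns; pooling all speakers into a single graph would instead average those patterns together.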