Speech disfluency modeling is the bottleneck for both speech therapy and language learning, yet no existing AI solution tackles the problem systematically. We first solidify the concepts of disfluent speech and disfluent speech modeling. We then present the Hierarchical Unconstrained Disfluency Modeling (H-UDM) approach, the hierarchical extension of UDM, which addresses both disfluency transcription and detection and thereby eliminates the need for extensive manual annotation. Our experiments demonstrate the effectiveness and reliability of the proposed methods on both the transcription and detection tasks.