Prosodic information derived from speech signals is useful in a wide range of applications. However, extracting it is often laborious and requires manual intervention. This work therefore develops a tool that produces prosodic annotations for a given speech signal, with a particular focus on Indian languages. The proposed Segmentation with Intensity, Tones and Break Indices (SIToBI) tool provides time-aligned phoneme, syllable, and word transcriptions, syllable-level pitch contour annotations, break indices, and syllable-level relative intensity indices. The tool emphasizes syllable-level annotations because Indian languages are syllable-timed. Moreover, speakers in India, regardless of the language they speak, may be influenced by the other languages around them, so other languages spoken in India may also exhibit syllable-timed characteristics. The accuracy of the tool's annotations is assessed by comparing them against manual annotations, and the tool is observed to perform well. Although the current work covers three languages, namely Tamil, Hindi, and Indian English, the tool can readily be extended to other Indian languages and possibly to other syllable-timed languages.
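To make the notion of a syllable-level relative intensity index concrete, the following is a minimal sketch of one plausible way to compute it from time-aligned syllable boundaries. The abstract does not specify how SIToBI defines or computes this index, so everything here is an assumption made for illustration: the `Syllable` class, the `rms_db` and `relative_intensity` functions, and the choice to express each syllable's RMS level in dB relative to the mean level over all syllables in the utterance.

```python
import math
from dataclasses import dataclass


@dataclass
class Syllable:
    """Hypothetical time-aligned syllable segment (times in seconds)."""
    label: str
    start: float
    end: float


def rms_db(samples):
    """RMS level in dB re full scale; a small floor avoids log(0)."""
    mean_sq = sum(s * s for s in samples) / len(samples) if samples else 0.0
    return 10.0 * math.log10(max(mean_sq, 1e-12))


def relative_intensity(samples, sample_rate, syllables):
    """Assumed definition: per-syllable RMS level minus the utterance mean (dB)."""
    levels = []
    for syl in syllables:
        lo = int(round(syl.start * sample_rate))
        hi = int(round(syl.end * sample_rate))
        levels.append(rms_db(samples[lo:hi]))
    mean_level = sum(levels) / len(levels)
    return [(syl.label, level - mean_level) for syl, level in zip(syllables, levels)]


# Synthetic demo: two 0.1 s "syllables" of a 200 Hz tone, the second twice as loud.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 200 * n / sample_rate) for n in range(800)]
samples = [0.1 * x for x in tone] + [0.2 * x for x in tone]
syllables = [Syllable("ka", 0.0, 0.1), Syllable("maa", 0.1, 0.2)]

result = relative_intensity(samples, sample_rate, syllables)
for label, rel_db in result:
    print(f"{label}: {rel_db:+.2f} dB")
```

Normalizing against the utterance mean makes the values comparable across recordings with different gain, which is one reason a relative rather than absolute index might be preferred. A real tool would likely use windowed intensity estimates and boundaries from a forced aligner instead of the hand-written segments used here.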