India's linguistic diversity poses a significant challenge to inclusivity in its education sector. Although online educational content has democratized knowledge, the dominance of English as the internet's lingua franca limits accessibility and underscores the need for translation into Indian languages. Speech-to-Speech Machine Translation (SSMT) technologies exist, but their lack of intonation yields monotonous translations, causing audiences to lose interest and disengage from the content. To address this, our paper introduces a dataset with stress annotations in Indian English, along with a Text-to-Speech (TTS) architecture capable of incorporating stress into synthesized speech. The dataset is used to train a stress detection model, which the SSMT system then uses to detect stress in the source speech and transfer it to the target-language speech. The TTS architecture is based on FastPitch and can modify the variances according to the stressed words it is given. We present an Indian English-to-Hindi SSMT system that can transfer stress, with the aim of enhancing the overall quality and engagement of educational content.
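To make the idea of modifying variances for stressed words concrete, the following is a minimal sketch rather than the architecture described above. It assumes the variances are per-word pitch, energy, and duration values and that stress is applied by multiplicative scaling; the abstract specifies neither. The names `WordVariance` and `apply_stress` and the default scale factors are hypothetical.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class WordVariance:
    """Hypothetical per-word variance values predicted by a TTS model."""
    word: str
    pitch: float     # assumed: mean pitch for the word, e.g. in Hz
    energy: float    # assumed: mean energy for the word
    duration: float  # assumed: word duration in seconds


def apply_stress(words, stressed_indices,
                 pitch_scale=1.2, energy_scale=1.1, duration_scale=1.15):
    """Return a new list in which stressed words have scaled variances.

    The scale factors are illustrative defaults, not values from the paper.
    """
    stressed = set(stressed_indices)
    result = []
    for index, item in enumerate(words):
        if index in stressed:
            # Raise pitch and energy and lengthen the word to mark stress.
            item = replace(
                item,
                pitch=item.pitch * pitch_scale,
                energy=item.energy * energy_scale,
                duration=item.duration * duration_scale,
            )
        result.append(item)
    return result


# Usage: stress the second word of a short utterance.
utterance = [
    WordVariance("this", 100.0, 1.0, 0.2),
    WordVariance("matters", 100.0, 1.0, 0.4),
]
emphasized = apply_stress(utterance, stressed_indices=[1])
```

In a full SSMT pipeline, the stressed indices would come from the stress detection model applied to the source speech and mapped onto the translated words; that alignment step is not shown here.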