This paper describes our system submission to the International Conference on Spoken Language Translation (IWSLT 2024) for Irish-to-English speech translation. We built end-to-end systems based on Whisper and applied several data augmentation techniques, including speech back-translation and noise augmentation. We investigate the effect of using synthetic audio data and discuss several methods for enriching signal diversity.