In multilingual societies, social conversations often involve code-mixed speech. Current speech technology may not be well equipped to extract information from multilingual, multi-speaker conversations. The DISPLACE challenge is a first-of-its-kind task that benchmarks speaker diarization and language diarization on the same data, which consists of multi-speaker conversations in multilingual code-mixed speech. The challenge aims to highlight outstanding issues in speaker diarization (SD) in multilingual settings with code-mixing. Language diarization (LD) in multi-speaker settings introduces further challenges, since the system has to disambiguate speaker switches from code switches. For this challenge, a natural multilingual, multi-speaker conversational dataset is distributed for development and evaluation purposes. The systems are evaluated on single-channel far-field recordings. We also release a baseline system and report the highlights of the system submissions.