The performance of speaker verification systems degrades significantly under language mismatch, a critical challenge exacerbated by the field's reliance on English-centric data. To address this, we propose the TidyVoice Challenge for cross-lingual speaker verification. The challenge leverages the TidyVoiceX dataset from the novel TidyVoice benchmark, a large-scale multilingual corpus derived from Mozilla Common Voice and curated specifically to isolate the effect of language switching across approximately 40 languages. Participants will be tasked with building systems robust to this mismatch, with performance evaluated primarily by the Equal Error Rate (EER) on cross-language trials. By providing standardized data, open-source baselines, and a rigorous evaluation protocol, the challenge aims to drive research towards fairer, more inclusive, and language-independent speaker recognition technologies, directly aligning with the Interspeech 2026 theme, "Speaking Together."
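For readers unfamiliar with the primary metric, the following is a minimal sketch of how an Equal Error Rate can be computed from verification scores. It is an illustration only, not the challenge's official scoring tool: the function name `equal_error_rate` and the toy scores are assumptions made for this example, and official tooling may interpolate between operating points rather than pick the closest observed threshold.

```python
def equal_error_rate(target_scores, nontarget_scores):
    """Return (eer, threshold) at the point where the false acceptance
    and false rejection rates are closest.

    Hypothetical helper for illustration: it sweeps every observed score
    as a candidate threshold and accepts a trial when score >= threshold.
    """
    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    best = None  # (gap, eer, threshold)
    for t in thresholds:
        # False rejection: a same-speaker trial scored below the threshold.
        frr = sum(s < t for s in target_scores) / len(target_scores)
        # False acceptance: a different-speaker trial scored at or above it.
        far = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


# Made-up scores standing in for cross-language trials (higher = more similar).
target = [0.9, 0.8, 0.7, 0.4]      # same speaker, different languages
nontarget = [0.6, 0.3, 0.2, 0.1]   # different speakers

eer, threshold = equal_error_rate(target, nontarget)
print(f"EER = {eer:.2%} at threshold {threshold}")
```

On these toy scores one target trial falls below the chosen threshold and one non-target trial reaches it, so both error rates are equal at that operating point. A lower EER indicates a system that separates speakers more reliably under language mismatch.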