This paper describes the Ubenwa CryCeleb dataset, a labeled collection of infant cries, and the accompanying CryCeleb 2023 task, a public speaker verification challenge based on infant cry sounds. To encourage research in infant cry analysis, we release more than 6 hours of manually segmented cry sounds from 786 newborns for academic use.