We report on the curation of several publicly available datasets for age and gender prediction, and we present experiments that predict age and gender with models based on a pre-trained wav2vec 2.0 model. Depending on the dataset, we achieve a mean absolute error (MAE) between 7.1 and 10.8 years for age, and an accuracy (ACC) of at least 91.1% for gender (female, male, child). Compared to a modelling approach built on handcrafted features, our proposed system improves unweighted average recall (UAR) by 9% for age and 4% for gender. To make our findings reproducible, we release the best-performing model to the community, together with the sample lists of the data splits.
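For readers unfamiliar with the three metrics named above, the following is a minimal sketch of how MAE, ACC, and UAR are commonly computed. It is not the authors' evaluation code: the function names and the toy labels are illustrative assumptions. UAR is defined over classes, and how age is discretised into classes for the reported UAR is not stated here.

```python
from collections import defaultdict


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between true and predicted ages, in years."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted class equals the true class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def unweighted_average_recall(y_true, y_pred):
    """Mean of per-class recalls; every class counts equally, whatever its size."""
    total = defaultdict(int)
    correct = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        correct[t] += int(t == p)
    return sum(correct[c] / total[c] for c in total) / len(total)


# Toy data (hypothetical, for illustration only).
ages_true = [25, 40, 8]
ages_pred = [30, 33, 10]

gender_true = ["female", "female", "female", "male", "male", "child"]
gender_pred = ["female", "female", "female", "male", "female", "female"]

mae = mean_absolute_error(ages_true, ages_pred)
acc = accuracy(gender_true, gender_pred)
uar = unweighted_average_recall(gender_true, gender_pred)
```

On imbalanced data the two classification metrics can diverge: in the toy labels the majority class is always correct, so ACC is higher than UAR, which penalises the missed minority classes equally.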