Federated learning (FL) has gained substantial attention in recent years because of the data privacy concerns raised by pervasive consumer devices that continuously collect data from users. Although a number of FL benchmarks have been developed to facilitate FL research, none of them includes audio data or audio-related tasks. In this paper, we fill this critical gap by introducing a new FL benchmark for audio tasks, which we refer to as FedAudio. FedAudio includes four representative and commonly used audio datasets from three important audio tasks that are well aligned with FL use cases. In particular, a unique contribution of FedAudio is the introduction of data noise and label errors into the datasets to emulate the challenges of deploying FL systems in real-world settings. FedAudio also includes benchmark results for the datasets and a PyTorch library, with the objective of helping researchers compare their algorithms fairly. We hope FedAudio will act as a catalyst for new FL research on audio tasks and thus benefit the acoustic and speech research community. The datasets and benchmark results can be accessed at https://github.com/zhang-tuo-pdf/FedAudio.