The primary goal of the L3DAS23 Signal Processing Grand Challenge at ICASSP 2023 is to promote and support collaborative research on machine learning for 3D audio signal processing, with a specific emphasis on 3D speech enhancement and 3D Sound Event Localization and Detection in Extended Reality applications. As part of this latest competition, we provide a brand-new dataset, which maintains the same general characteristics as the L3DAS21 and L3DAS22 datasets, but with first-order Ambisonics recordings from multiple reverberant simulated environments. Moreover, we begin to explore an audio-visual scenario by providing images of these environments as seen from the different microphone positions and orientations. We also propose updated baseline models for both tasks, which now support audio-image pairs as input, and a supporting API to replicate our results. Finally, we present the results of the participants. Further details about the challenge are available at https://www.l3das.com/icassp2023.