In recent years, the improved capabilities of ASR models and the emergence of multi-dialect datasets have increasingly pushed Arabic ASR model development in an all-dialect-in-one direction. This trend highlights the need for benchmarking studies that evaluate model performance across multiple dialects and give the community insight into how well models generalize. In this paper, we introduce the Open Universal Arabic ASR Leaderboard, a continuous benchmark project for open-source general Arabic ASR models across various multi-dialect datasets. We also provide a comprehensive analysis of the models' robustness, speaker adaptation, inference efficiency, and memory consumption. This work aims to offer the Arabic ASR community a reference for models' general performance and to establish a common evaluation framework for multi-dialectal Arabic ASR models.