We present OpenMU-Bench, a large-scale benchmark suite that addresses data scarcity in training multimodal language models to understand music. To construct OpenMU-Bench, we leveraged existing datasets and bootstrapped new annotations. OpenMU-Bench also broadens the scope of music understanding by including lyrics understanding and music tool usage. Using OpenMU-Bench, we trained our music understanding model, OpenMU, and conducted extensive ablations; the results show that OpenMU outperforms baseline models such as MU-Llama. Both OpenMU and OpenMU-Bench are open-sourced to facilitate future research in music understanding and to make creative music production more efficient.