Recent advances in large language models (LLMs) have created new opportunities for symbolic music generation. However, existing formats such as MIDI, ABC, and MusicXML are either overly complex or structurally inconsistent, which limits their suitability for token-based learning architectures. To address these challenges, we propose HNote, a novel hexadecimal notation system that extends YNote and encodes both pitch and duration within a fixed 32-unit measure framework. This design ensures alignment, reduces ambiguity, and is directly compatible with LLM architectures. We converted 12,300 Jiangnan-style songs, generated from traditional folk pieces, from YNote into HNote and fine-tuned LLaMA-3.1 (8B) using parameter-efficient LoRA. Experimental results show that HNote achieves a syntactic correctness rate of 82.5%, and BLEU and ROUGE evaluations indicate strong symbolic and structural similarity, with the model producing stylistically coherent compositions. This study establishes HNote as an effective framework for integrating LLMs with cultural music modeling.