Music composition represents the creative side of humanity and is itself a complex task, requiring the ability to understand and generate information subject to long-range dependencies and harmony constraints. Despite demonstrating impressive capabilities in STEM subjects, current LLMs readily fail at this task, generating poorly written music even when equipped with modern techniques such as in-context learning and chain-of-thought prompting. To further explore and enhance LLMs' potential in music composition by leveraging their reasoning ability and their large knowledge base in music history and theory, we propose ComposerX, an agent-based symbolic music generation framework. We find that applying a multi-agent approach significantly improves the music composition quality of GPT-4. The results demonstrate that ComposerX is capable of producing coherent polyphonic music compositions with captivating melodies while adhering to user instructions.