This study aims to improve the quality of Transformer-based music generation by incorporating meta-information. Although Transformer-based approaches capture long-term dependencies in musical compositions well, the music they generate often suffers from excessive repetition or duplicated notes, which makes melodies sound unnatural. To address these limitations, we propose Musical Attention, a mechanism that incorporates meta-information such as bar numbers, key signatures, and tempos into the attention process. Musical Attention explicitly leverages both the structural properties of music and its associated metadata, so that the Transformer's attention mechanism operates more effectively and the quality of the generated output improves. In our framework, each musical note is represented as a combination of five events (pitch, bar number, onset, duration, and velocity) together with the three metadata elements. The attention mechanism is then modified to reflect the correlations among these eight features, allowing the model to better capture the inherent characteristics of musical composition. Experimental results show that the model with Musical Attention outperforms prior methods, such as Full Attention and Strided Attention, in terms of musical coherence, variation, and overall quality. Notably, it significantly reduces repetition and improves the model's ability to generate diverse, harmonically consistent melodies. Musical Attention thus represents a meaningful advance in AI-driven music generation, facilitating the creation of more natural and expressive compositions.
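To make the eight-feature note representation concrete, the following is a minimal sketch of one way an attention score could be biased by agreement between note features. It is not the formulation used in this work, which the abstract does not specify. The names `Note`, `feature_bias`, `biased_attention_weights`, and the `strength` parameter are hypothetical, the three metadata elements are kept as an unnamed integer triple, and the additive agreement bias is an assumption chosen only for illustration.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Note:
    # Five per-note events named in the abstract.
    pitch: int
    bar: int
    onset: int
    duration: int
    velocity: int
    # Three metadata elements, stored as integer codes (assumed encoding).
    meta: Tuple[int, int, int]

    def features(self) -> Tuple[int, ...]:
        """Return all eight features as one flat tuple."""
        return (self.pitch, self.bar, self.onset,
                self.duration, self.velocity, *self.meta)


def feature_bias(a: Note, b: Note) -> float:
    """Fraction of the eight features on which two notes agree (assumption)."""
    fa, fb = a.features(), b.features()
    return sum(x == y for x, y in zip(fa, fb)) / len(fa)


def softmax(xs: List[float]) -> List[float]:
    """Numerically stable softmax over a non-empty list."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def biased_attention_weights(content_scores: List[List[float]],
                             notes: List[Note],
                             strength: float = 1.0) -> List[List[float]]:
    """Causal attention weights: content score plus a feature-agreement bias.

    content_scores[i][j] stands in for the usual scaled query-key product.
    Position i attends only to positions j <= i.
    """
    n = len(notes)
    weights = []
    for i in range(n):
        row = [content_scores[i][j] + strength * feature_bias(notes[i], notes[j])
               for j in range(i + 1)]
        weights.append(softmax(row) + [0.0] * (n - i - 1))
    return weights


if __name__ == "__main__":
    notes = [
        Note(60, 0, 0, 4, 80, (0, 4, 120)),
        Note(62, 1, 0, 2, 80, (0, 4, 120)),
        Note(60, 1, 4, 4, 80, (0, 4, 120)),
    ]
    zeros = [[0.0] * len(notes) for _ in notes]
    for row in biased_attention_weights(zeros, notes):
        print([round(w, 3) for w in row])
```

With all content scores set to zero, the weights depend only on the bias, so each note attends more strongly to earlier notes that share more of its eight features. In a real model the bias would more plausibly be learned than fixed.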