Polyphonic music generation remains challenging because melody and harmony must be generated together and stay consistent with each other. Most previous studies used RNN-based models, which struggle to capture relationships between notes that are far apart. In this paper, we propose a neural network for polyphonic music generation named Choir Transformer (https://github.com/Zjy0401/choir-transformer), which uses relative positional attention to better model musical structure. We also propose a music representation suited to polyphonic music generation. Choir Transformer surpasses the previous state of the art by 4.06% in accuracy. We also measure harmony metrics of the generated polyphonic music; experiments show that they are close to those of Bach's music. In practical applications, the generated melody and rhythm can be adjusted according to a specified input, allowing different styles such as folk or pop music.
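To make the idea of relative positional attention concrete, the following is a minimal sketch and not the Choir Transformer implementation. It assumes one common formulation, in which a vector for the clipped offset j - i between two positions is added to the attention logits. The function name `relative_attention` and the parameters `rel_emb` and `max_dist` are hypothetical, and a real model would learn `rel_emb` and the query, key, and value projections.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def relative_attention(q, k, v, rel_emb, max_dist):
    """Single-head self-attention with a relative position term (illustrative).

    q, k: lists of n vectors of dimension d; v: list of n value vectors.
    rel_emb: list of 2 * max_dist + 1 vectors of dimension d, where
             index 0 corresponds to offset -max_dist (assumed layout).
    """
    n, d = len(q), len(q[0])
    out = []
    for i in range(n):
        scores = []
        for j in range(n):
            # Clip the offset so distant positions share one embedding.
            offset = max(-max_dist, min(max_dist, j - i))
            r = rel_emb[offset + max_dist]
            # Content term plus relative position term, scaled by sqrt(d).
            scores.append((dot(q[i], k[j]) + dot(q[i], r)) / math.sqrt(d))
        weights = softmax(scores)
        # Weighted sum of the value vectors.
        out.append([sum(weights[j] * v[j][c] for j in range(n))
                    for c in range(len(v[0]))])
    return out
```

With all relative embeddings set to zero, this reduces to standard scaled dot-product attention. Because the position term depends only on the offset between two notes, the same interval pattern is scored the same way wherever it occurs in the sequence.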