As generative models have grown in popularity, so has their application to music. Our study compares the performance of a simple Markov chain model and a recurrent neural network (RNN) model, two popular models for sequence-generation tasks, on jazz improvisation. Although judging whether a piece of music, and jazz in particular, is "good" or "bad" remains subjective, we quantify our results using two metrics: groove pattern similarity and pitch class histogram entropy. We trained both models on transcriptions of jazz blues choruses by professional jazz players, using the music21 library to tokenize the dataset into pitches and durations that the models could interpret and train on. We also supplied jazz seeds to provide musical context at the start of generation. Our results show that the RNN outperforms the Markov model on both metrics, indicating better rhythmic consistency and tonal stability in the generated music. Our findings contribute to the growing field of AI-generated music and highlight the importance of quantitative metrics for assessing generation quality. Future work includes expanding the dataset of MIDI files, conducting human surveys for subjective evaluation, and incorporating additional metrics to address the challenge of subjectivity in music evaluation. Our study provides insight into the use of recurrent neural networks for sequence-based tasks such as music generation.
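The abstract names its two evaluation metrics without defining them. As a rough illustration only, the sketch below shows one common way such metrics are computed: pitch class histogram entropy as the Shannon entropy of the 12-bin pitch class distribution, and groove pattern similarity as one minus the fraction of differing positions between two binary onset grids. These definitions, the function names, and the input formats (MIDI pitch numbers, per-bar 0/1 onset lists) are assumptions made for this example and may differ from what the study actually used.

```python
import math
from collections import Counter
from itertools import combinations


def pitch_class_entropy(midi_pitches):
    """Shannon entropy (in bits) of the 12-bin pitch class histogram.

    Assumed input: an iterable of integer MIDI pitch numbers.
    """
    counts = Counter(p % 12 for p in midi_pitches)
    total = sum(counts.values())
    if total == 0:
        return 0.0  # no notes: treat entropy as zero
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def groove_similarity(bar_a, bar_b):
    """One minus the fraction of grid positions where two bars differ.

    Assumed input: equal-length lists of 0/1 flags, one per time step in a
    bar, where 1 marks a note onset.
    """
    if len(bar_a) != len(bar_b) or len(bar_a) == 0:
        raise ValueError("bars must be non-empty and of equal length")
    differing = sum(a != b for a, b in zip(bar_a, bar_b))
    return 1.0 - differing / len(bar_a)


def mean_groove_similarity(bars):
    """Average groove similarity over all pairs of bars in a piece."""
    pairs = list(combinations(bars, 2))
    if not pairs:
        return 1.0  # a single bar is trivially consistent with itself
    return sum(groove_similarity(a, b) for a, b in pairs) / len(pairs)
```

Under these assumed definitions, lower entropy means the notes concentrate on fewer pitch classes, and a mean groove similarity closer to 1 means the onset pattern repeats more consistently from bar to bar.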