Deep generative models have been used in style transfer tasks for images. In this study, we adapt and improve the CycleGAN model to perform music style transfer between the Jazz and Classic genres. In doing so, we aim to easily generate new songs, cover songs in different genres, and reduce the arrangement work these processes require. We train a music genre classifier to assess the performance of the transfer models, obtaining 87.7% accuracy with a Multi-layer Perceptron. To improve our style transfer baseline, we add auxiliary discriminators and a triplet loss to our model. In our experiments, evaluated with the developed genre classifier, we achieve the best accuracies of 69.4% on the Jazz-to-Classic task and 39.3% on the Classic-to-Jazz task. We also run a subjective experiment, whose results show that the overall performance of our transfer model is good and that it preserves the melody of the inputs in the transferred outputs. Our code is available at https://github.com/fidansamet/tune-it-up
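The triplet loss mentioned above can be sketched as follows. This is a minimal NumPy illustration of the standard hinge-form triplet loss; the distance metric, margin value, and toy embeddings are assumptions for illustration, not the paper's actual configuration.

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=1.0):
    # Euclidean distances; the hinge keeps the loss at zero once the
    # negative is at least `margin` farther from the anchor than the positive.
    d_pos = np.linalg.norm(anchor - positive)
    d_neg = np.linalg.norm(anchor - negative)
    return max(0.0, d_pos - d_neg + margin)

# Toy embeddings of three clips: anchor and positive share a genre,
# the negative comes from the other genre (values are hypothetical).
anchor = np.array([0.0, 0.0])
positive = np.array([0.1, 0.0])
negative = np.array([5.0, 0.0])
print(triplet_loss(anchor, positive, negative))  # well-separated triplet -> 0.0
```

Such a term encourages the generator's embeddings of same-genre clips to cluster while pushing apart clips from different genres.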