This work presents a generative neural network that produces expressive piano performances in MIDI format. The musical expressivity is reflected in vivid micro-timing, rich polyphonic texture, varied dynamics, and sustain-pedal effects. The model is innovative in many respects, from data processing to neural network design. We argue that this symbolic music generation model overcomes common criticisms of symbolic music and can generate expressive musical flows as good as, if not better than, those generated from raw audio. One limitation is that, due to the limited time before submission, the model was not sufficiently trained or fine-tuned, so the output may sound incoherent or random at certain points. Despite this, the model demonstrates a strong ability to generate expressive piano pieces.