Music accompaniment generation is a crucial part of the composition process. Deep neural networks have made significant strides in this field, but it remains a challenge for AI to incorporate human emotion and create beautiful accompaniments: existing models struggle to represent emotion inside the network while composing. To address this issue, we propose using an easy-to-represent emotion flow model, the Valence/Arousal Curve, which makes emotional information compatible with the model through data transformation, and we adopt a Variational Autoencoder as the model structure to enhance the interpretability of emotional factors. Further, we use relative self-attention to maintain the structure of the music at the phrase level and, combined with the rules of music theory, to generate a richer accompaniment.