Music is commonly recognized as a means of expressing emotions. In this context, an intense debate has emerged from the need to verbalize musical emotions. This concern is highly relevant today, given the rapid growth of natural language processing with deep learning models, which makes it possible to generate music automatically from semantic prompts. This scoping review aims to analyze and discuss the possibilities of music generation conditioned by emotions. To address this topic, we propose a historical perspective that encompasses the different disciplines and methods contributing to it. In detail, we review two main paradigms adopted in automatic music generation: rule-based and machine-learning models. Of note are the deep learning architectures that aim to generate high-fidelity music from textual descriptions. These models raise fundamental questions about the expressivity of music, including whether emotions can be represented with words or expressed through them. We conclude that, by overcoming the limitations and ambiguity of language in expressing emotions through music, the use of deep learning with natural language has the potential to impact the creative industries by providing powerful tools to prompt and generate new musical works.