A prominent theory of affective response to music revolves around the concepts of surprisal and expectation. Prior work has operationalized this idea as probabilistic models of music, which allow precise computation of the probability of a song (or of each note), conditioned on a 'training set' of prior musical or cultural experiences. To date, however, these models have either computed exact probabilities from hand-crafted features or been restricted to linear models, which are likely insufficient to represent the complex conditional distributions present in music. In this work, we propose using a modern deep probabilistic generative model, in the form of a diffusion model, to compute an approximate likelihood of a musical input sequence. Unlike prior approaches, a generative model parameterized by deep neural networks can learn complex non-linear features directly from the training set itself. We therefore expect such models to represent the 'surprisal' of music for human listeners more accurately. The literature reports an inverted U-shaped relationship between surprisal and how much human subjects 'like' a given song. We show that pre-trained diffusion models indeed yield musical surprisal values that exhibit a negative quadratic relationship with measured subject 'liking' ratings, and that the quality of this relationship is competitive with state-of-the-art methods such as IDyOM. We therefore present this model as a preliminary step toward developing modern deep generative models of music expectation and subjective likability.
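The inverted-U claim can be made concrete with a small sketch. The code below is not the authors' pipeline: it assumes surprisal is taken as the negative log-likelihood of a piece, uses synthetic made-up data in place of model outputs and listener ratings, and relies on hypothetical helper names (`surprisal_from_log_likelihood`, `fit_inverted_u`). It only illustrates what a "negative quadratic relationship" means, namely a least-squares quadratic fit whose leading coefficient is below zero.

```python
import numpy as np


def surprisal_from_log_likelihood(log_likelihood):
    """Surprisal taken as negative log-probability (an assumption of this sketch)."""
    return -np.asarray(log_likelihood, dtype=float)


def fit_inverted_u(surprisal, liking):
    """Least-squares fit of liking = a*s^2 + b*s + c. Returns (a, b, c, r2)."""
    s = np.asarray(surprisal, dtype=float)
    y = np.asarray(liking, dtype=float)
    a, b, c = np.polyfit(s, y, deg=2)  # coefficients, highest degree first
    pred = a * s**2 + b * s + c
    ss_res = np.sum((y - pred) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot  # coefficient of determination
    return a, b, c, r2


# Synthetic data for illustration only; no real ratings or model outputs.
rng = np.random.default_rng(0)
log_lik = -np.linspace(1.0, 9.0, 50)  # stand-in for model log-likelihoods
s = surprisal_from_log_likelihood(log_lik)
liking = -0.5 * (s - 5.0) ** 2 + 7.0 + rng.normal(0.0, 0.3, size=s.size)

a, b, c, r2 = fit_inverted_u(s, liking)
peak = -b / (2 * a)  # surprisal at which the fitted liking is highest
print(f"a={a:.3f}, peak surprisal={peak:.2f}, R^2={r2:.3f}")
```

A negative `a` indicates the inverted-U shape, and `r2` is one simple way to quantify the "quality" of the relationship; the paper's actual metric may differ.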