Music Structure Analysis (MSA) is a Music Information Retrieval task that represents a song in a simplified, organized manner by dividing it into sections, typically corresponding to ``chorus'', ``verse'', ``solo'', etc. In this work, we extend an MSA algorithm called the Correlation Block-Matching (CBM) algorithm, introduced by Marmoret et al. (2020, 2022b). The CBM algorithm is a dynamic programming algorithm that segments self-similarity matrices, a standard representation used in MSA and in numerous other applications. Here, self-similarity matrices are computed from a feature representation of the audio signal, with time sampled at the bar scale. We examine three standard similarity functions for computing these matrices. Results show that, under optimal conditions, the proposed algorithm achieves performance competitive with supervised state-of-the-art methods while requiring only knowledge of bar positions. In addition, the algorithm is released as open source and is highly customizable.
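To make the pipeline summarized above concrete, the following is a minimal sketch of segmenting a bar-scale self-similarity matrix with dynamic programming. It is not the CBM implementation: the cosine similarity is only one possible choice of similarity function, and the names `cosine_ssm`, `block_score`, `segment`, and `max_size`, as well as the block score itself (sum of off-diagonal similarities divided by the block size), are assumptions made for illustration.

```python
import math


def cosine_ssm(bar_features):
    """Self-similarity matrix of per-bar feature vectors (cosine similarity)."""
    norms = [math.sqrt(sum(x * x for x in f)) for f in bar_features]
    n = len(bar_features)
    ssm = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if norms[i] > 0 and norms[j] > 0:
                dot = sum(a * b for a, b in zip(bar_features[i], bar_features[j]))
                ssm[i][j] = dot / (norms[i] * norms[j])
    return ssm


def block_score(ssm, start, end):
    """Hypothetical score of the segment [start, end): sum of off-diagonal
    similarities inside the block, divided by the block size."""
    size = end - start
    total = sum(
        ssm[i][j]
        for i in range(start, end)
        for j in range(start, end)
        if i != j
    )
    return total / size


def segment(ssm, max_size=32):
    """Return the bar boundaries maximizing the summed block scores."""
    n = len(ssm)
    best = [0.0] + [-math.inf] * n  # best[t]: best total score for bars [0, t)
    prev = [0] * (n + 1)            # prev[t]: start of the last segment ending at t
    for end in range(1, n + 1):
        for start in range(max(0, end - max_size), end):
            candidate = best[start] + block_score(ssm, start, end)
            if candidate > best[end]:
                best[end] = candidate
                prev[end] = start
    # Backtrack from the last bar to recover the boundaries.
    boundaries = [n]
    while boundaries[-1] > 0:
        boundaries.append(prev[boundaries[-1]])
    return boundaries[::-1]


# Toy input: four bars with one feature profile, then four bars with another.
bars = [[1.0, 0.0]] * 4 + [[0.0, 1.0]] * 4
print(segment(cosine_ssm(bars)))  # [0, 4, 8]
```

Dividing by the block size keeps the score from rewarding either a single all-encompassing segment or many one-bar segments. Other normalizations or penalty terms would be equally plausible in such a sketch.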