We revisit the problems of pitch spelling and tonality guessing with a new algorithm for their joint estimation from a MIDI file that includes information about measure boundaries. Our algorithm identifies not only a global key but also local keys throughout the analyzed piece. It uses dynamic programming techniques to search for an optimal spelling in terms, roughly, of the number of accidental symbols that would be displayed in the engraved score. The evaluation of this number is coupled with an estimation of the global key and of the local keys, one for each measure. Each of these three pieces of information is used to estimate the others, in a multi-step procedure. An evaluation conducted on a monophonic dataset and a piano dataset, comprising 216,464 notes in total, shows a high degree of accuracy, both for pitch spelling (99.5% on average on the Bach corpus and 98.2% on the whole dataset) and for global key signature estimation (93.0% on average, 95.58% on the piano dataset). Originally designed as a backend tool in a music transcription framework, this method should also be useful in other tasks related to music notation processing.
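To make the idea of minimizing displayed accidentals concrete, here is a minimal sketch in Python. It is not the algorithm evaluated above, only a simplified illustration under stated assumptions: the names `key_signature`, `candidates`, `spell_measure` and `guess_key` are hypothetical, spellings are limited to single sharps and flats, octaves are ignored when tracking accidentals within a measure, and the global key is simply the signature with the lowest total cost (local keys are not modeled).

```python
# Illustrative sketch only: a simplified cost model, not the method described above.
LETTERS = "CDEFGAB"
NATURAL_PC = [0, 2, 4, 5, 7, 9, 11]   # pitch class of each natural letter
SHARP_ORDER = "FCGDAEB"               # order in which sharps enter a key signature


def key_signature(fifths):
    """Alteration of each letter for `fifths` sharps (> 0) or flats (< 0)."""
    alter = [0] * 7
    if fifths > 0:
        for letter in SHARP_ORDER[:fifths]:
            alter[LETTERS.index(letter)] = 1
    else:
        # Flats enter in the reverse order: B, E, A, D, G, C, F.
        for letter in SHARP_ORDER[::-1][:-fifths]:
            alter[LETTERS.index(letter)] = -1
    return tuple(alter)


def candidates(midi_pitch):
    """Possible (letter index, alteration) spellings, single accidentals only."""
    pc = midi_pitch % 12
    return [(i, alt)
            for i, nat in enumerate(NATURAL_PC)
            for alt in (-1, 0, 1)
            if (nat + alt) % 12 == pc]


def spell_measure(pitches, fifths):
    """Return (printed accidentals, spelling) with the fewest printed accidentals.

    Dynamic programming over accidental states: a state records the alteration
    currently in force for each letter (assumption: octaves are ignored).
    """
    frontier = {key_signature(fifths): (0, [])}
    for pitch in pitches:
        nxt = {}
        for state, (cost, names) in frontier.items():
            for letter, alt in candidates(pitch):
                # An accidental is printed only if it differs from the one in force.
                printed = 0 if state[letter] == alt else 1
                new_state = state[:letter] + (alt,) + state[letter + 1:]
                name = LETTERS[letter] + {-1: "b", 0: "", 1: "#"}[alt]
                entry = (cost + printed, names + [name])
                # Keep only the cheapest way of reaching each state.
                if new_state not in nxt or entry[0] < nxt[new_state][0]:
                    nxt[new_state] = entry
        frontier = nxt
    return min(frontier.values(), key=lambda e: e[0])


def guess_key(measures):
    """Pick the signature (-7..7 fifths) with the fewest printed accidentals overall."""
    def total(fifths):
        return sum(spell_measure(m, fifths)[0] for m in measures)
    return min(range(-7, 8), key=total)
```

With these assumptions, `spell_measure([62, 66, 69], 2)` returns `(0, ['D', 'F#', 'A'])`: under a two-sharp signature the notes D, F sharp and A need no printed accidental. Resetting the state at each call mirrors the role of measure boundaries, since an accidental stays in force only until the next barline.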