Tracking the fundamental frequency (f0) of a monophonic instrumental performance is effectively a solved problem, with several solutions achieving 99% accuracy. However, the related task of automatic music transcription requires a further processing step: segmenting the f0 contour into discrete notes. This note segmentation sub-task is necessary for a range of applications, including musicological analysis and symbolic music generation. Building on CREPE, a state-of-the-art monophonic pitch tracker based on a simple neural network, we propose a simple and effective method for post-processing CREPE's output to achieve monophonic note segmentation. The proposed method achieves state-of-the-art results on two challenging datasets of monophonic instrumental music. It also uses 97% fewer parameters in total than other deep-learning-based methods.
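To make the note segmentation task concrete, the following is a minimal illustrative sketch of what turning a frame-wise f0 contour into discrete notes can look like. It is not the method proposed here: it is a naive baseline that rounds each voiced frame to the nearest MIDI pitch and merges consecutive frames with the same pitch. The function names (`hz_to_midi`, `segment_notes`), the confidence threshold, and the minimum note duration are all assumptions chosen for illustration; the inputs are assumed to be per-frame times, f0 values in Hz, and voicing confidences of the kind a pitch tracker such as CREPE produces.

```python
import math


def hz_to_midi(f0_hz):
    """Convert a frequency in Hz to a (fractional) MIDI pitch, with A4 = 440 Hz = 69."""
    return 69.0 + 12.0 * math.log2(f0_hz / 440.0)


def segment_notes(times, f0_hz, confidence, conf_threshold=0.5, min_duration=0.05):
    """Naive baseline: merge consecutive voiced frames that share a rounded MIDI pitch.

    Returns a list of (onset_seconds, offset_seconds, midi_pitch) tuples.
    conf_threshold and min_duration are illustrative defaults, not tuned values.
    """
    # Assume a constant hop between frames; used to give each frame a duration.
    hop = times[1] - times[0] if len(times) > 1 else 0.0
    notes = []
    current = None  # [onset, offset, midi_pitch] of the note being built

    for t, f, c in zip(times, f0_hz, confidence):
        voiced = c >= conf_threshold and f > 0
        pitch = round(hz_to_midi(f)) if voiced else None

        if current is not None and pitch == current[2]:
            # Same pitch as the open note: extend its offset.
            current[1] = t + hop
            continue

        # Pitch changed or frame is unvoiced: close the open note, if any.
        if current is not None:
            notes.append(tuple(current))
        current = [t, t + hop, pitch] if voiced else None

    if current is not None:
        notes.append(tuple(current))

    # Discard fragments shorter than the minimum duration.
    return [n for n in notes if n[1] - n[0] >= min_duration]
```

A baseline like this over-segments on vibrato and pitch glides and cannot split repeated notes of the same pitch, which is part of why note segmentation is treated as a separate problem from f0 tracking.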