This paper describes a data-driven framework to parse musical sequences into dependency trees, which are hierarchical structures used in music cognition research and music analysis. The parsing involves two steps. First, the input sequence is passed through a transformer encoder to enrich it with contextual information. Then, a classifier filters the graph of all possible dependency arcs to produce the dependency tree. One major benefit of this system is that it can be easily integrated into modern deep-learning pipelines. Moreover, since it does not rely on any particular symbolic grammar, it can consider multiple musical features simultaneously, make use of sequential context information, and produce partial results for noisy inputs. We test our approach on two datasets of musical trees -- time-span trees of monophonic note sequences and harmonic trees of jazz chord sequences -- and show that our approach outperforms previous methods.
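The second parsing step, filtering the graph of all possible dependency arcs down to a tree, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `salience` values, the hand-written `toy_score` rule (standing in for the learned classifier), and the greedy best-head selection are hypothetical and are not taken from the paper. The transformer encoder step is omitted entirely.

```python
from itertools import permutations

def candidate_arcs(n):
    """All possible directed arcs (head, dependent) over n sequence positions."""
    return list(permutations(range(n), 2))

def toy_score(head, dep, salience):
    """Hypothetical stand-in for a learned arc classifier (an assumption).

    Only a more salient element may head a less salient one;
    nearer and more salient heads score higher.
    """
    if salience[head] <= salience[dep]:
        return float("-inf")
    return -abs(head - dep) + 0.1 * salience[head]

def filter_arcs(scores, n, root):
    """Keep, for each non-root position, its highest-scoring incoming arc."""
    heads = {}
    for dep in range(n):
        if dep == root:
            continue
        heads[dep] = max(
            (h for h in range(n) if h != dep),
            key=lambda h: scores[(h, dep)],
        )
    return heads

def is_tree(heads, root):
    """True if every position reaches the root by following head links."""
    for start in heads:
        seen, node = set(), start
        while node != root:
            if node in seen:  # a cycle: the selected arcs do not form a tree
                return False
            seen.add(node)
            node = heads[node]
    return True

# Toy input: one made-up salience value per sequence element.
salience = [4, 1, 2, 1, 3]
n = len(salience)
root = salience.index(max(salience))

scores = {(h, d): toy_score(h, d, salience) for (h, d) in candidate_arcs(n)}
heads = filter_arcs(scores, n, root)
print(heads, is_tree(heads, root))
```

Greedy selection of one head per element does not guarantee a tree in general, since the chosen arcs can form a cycle. That is why the sketch checks the result with `is_tree`; a real parser would need a decoding step that enforces the tree constraint.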