We present the Batik-plays-Mozart Corpus, a piano performance dataset that combines professional performances of Mozart piano sonatas with expert-labelled scores at note-level precision. The performances come from a recording by the Viennese pianist Roland Batik on a computer-monitored Bösendorfer grand piano and are available both as MIDI files and as audio recordings. They have been aligned, note by note, with a current standard edition of the corresponding scores (the New Mozart Edition), so that they can also be linked to the musicological annotations (harmony, cadences, phrases) on these scores that were recently published by Hentschel et al. (2021). The result is a high-quality, high-precision corpus that maps scores and musical structure annotations to note-level information from professional performances. As the first of its kind, it can serve as a valuable resource for studying various facets of expressive performance and their relationship to structural aspects. In this paper, we outline how the alignment was curated and conduct two exploratory experiments to demonstrate its usefulness in analyzing expressive performance.