HumTrans: A Novel Open-Source Dataset for Humming Melody Transcription and Beyond

This paper introduces HumTrans, a publicly available dataset designed primarily for humming melody transcription. The dataset can also serve as a foundation for downstream tasks such as music generation based on hummed melodies. It consists of 500 musical compositions of different genres and languages, with each composition divided into multiple segments. In total, the dataset comprises 1000 music segments. To collect the humming data, we recruited 10 college students, all of whom are either music majors or proficient in playing at least one musical instrument. Each of them hummed every segment twice using the web recording interface of a website we designed for this purpose. The humming recordings were sampled at 44,100 Hz. During each humming session, the main interface displays a musical score for the student to reference, and the melody audio plays simultaneously to help the student capture both melody and rhythm. The dataset contains approximately 56.22 hours of audio, making it the largest known humming dataset to date. The dataset will be released on Hugging Face, and we will provide a GitHub repository containing baseline results and evaluation code.
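As a rough sanity check on the figures quoted above, the following sketch works out the nominal number of recordings implied by the collection protocol and the average clip length that would follow from it. It assumes that every one of the 10 students recorded both takes of all 1000 segments and that all takes were kept; the abstract does not state the final file count, so the nominal count should be read as an upper bound, not a reported statistic. The function and constant names are illustrative only.

```python
# Figures quoted in the abstract.
NUM_SEGMENTS = 1000        # music segments in the dataset
NUM_SINGERS = 10           # college students who hummed
TAKES_PER_SEGMENT = 2      # each student hummed every segment twice
TOTAL_HOURS = 56.22        # approximate total audio duration
SAMPLE_RATE_HZ = 44_100    # recording sample rate


def nominal_recording_count(segments: int, singers: int, takes: int) -> int:
    """Upper bound on recordings, ASSUMING no take was skipped or discarded."""
    return segments * singers * takes


def mean_duration_seconds(total_hours: float, n_recordings: int) -> float:
    """Average clip length if the total duration were spread evenly."""
    return total_hours * 3600.0 / n_recordings


nominal = nominal_recording_count(NUM_SEGMENTS, NUM_SINGERS, TAKES_PER_SEGMENT)
mean_s = mean_duration_seconds(TOTAL_HOURS, nominal)
mean_samples = round(mean_s * SAMPLE_RATE_HZ)  # samples per clip at 44.1 kHz

print(f"nominal recordings: {nominal}")
print(f"mean clip length:   {mean_s:.2f} s ({mean_samples} samples)")
```

Under these assumptions the protocol implies 20,000 recordings of about 10 seconds each. If the released dataset contains fewer files, the true average clip is correspondingly longer.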

