A vital aspect of Indian Classical Music (ICM) is the Raga, which serves as a melodic framework for compositions and improvisations alike. Raga recognition is an important music information retrieval task in ICM, as it can aid numerous downstream applications ranging from music recommendation to organizing large music collections. In this work, we propose a deep learning based approach to Raga recognition. Our approach employs efficient preprocessing and learns temporal sequences in music data using Long Short-Term Memory based Recurrent Neural Networks (LSTM-RNN). We train and test the network on smaller sequences sampled from the original audio, while the final inference is performed on the audio as a whole. Our method achieves an accuracy of 88.1% and 97% during inference on the CompMusic Carnatic dataset and its 10-Raga subset, respectively, which makes it the state of the art for the Raga recognition task. Our approach also enables sequence ranking, which aids us in retrieving melodic patterns from a given music database that are closely related to the presented query sequence.
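The idea of training on short sampled sequences and then inferring on the whole recording can be illustrated with a minimal sketch. The abstract does not state how subsequence predictions are combined, so the majority vote used here is an assumption, as are all names (`sample_subsequences`, `predict_whole_audio`, `toy_classifier`) and the random-start sampling scheme. The toy classifier is only a stand-in for the trained LSTM-RNN.

```python
import random
from collections import Counter


def sample_subsequences(sequence, sub_len, num_samples, seed=0):
    """Draw fixed-length windows at random start positions from a long sequence."""
    if len(sequence) < sub_len:
        raise ValueError("sequence is shorter than the requested subsequence length")
    rng = random.Random(seed)
    max_start = len(sequence) - sub_len
    starts = [rng.randint(0, max_start) for _ in range(num_samples)]
    return [sequence[s:s + sub_len] for s in starts]


def predict_whole_audio(sequence, classify_subsequence, sub_len, num_samples, seed=0):
    """Label a full recording by majority vote over its sampled subsequences.

    Assumption: voting is one plausible aggregation rule, not necessarily
    the one used in the work described above.
    """
    windows = sample_subsequences(sequence, sub_len, num_samples, seed)
    votes = Counter(classify_subsequence(w) for w in windows)
    return votes.most_common(1)[0][0]


def toy_classifier(window):
    """Hypothetical stand-in for a trained LSTM-RNN: thresholds the window mean."""
    return "raga_A" if sum(window) / len(window) < 5 else "raga_B"
```

For example, `predict_whole_audio(pitch_sequence, toy_classifier, sub_len=20, num_samples=15)` would label a recording from fifteen 20-step windows, where `pitch_sequence` is any list of numeric values produced by preprocessing. In practice the classifier argument would wrap the trained network's prediction function.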