Bangla music has a rich musical culture of its own. Music genre classification has become increasingly important because the amount of available music, in both digital and physical formats, has grown exponentially, and collections must be indexed by genre to support efficient retrieval. Automatically classifying Bangla music by genre is therefore essential for locating specific pieces within a vast and diverse music library. Existing methods for genre classification predominantly rely on conventional machine learning or deep learning approaches. This work introduces a novel dataset comprising ten distinct genres of Bangla music. For audio classification, we use a recurrent neural network (RNN) architecture, specifically a Long Short-Term Memory (LSTM) network, which is trained to perform the classification. Feature extraction is a foundational stage in audio data processing; this study uses Mel-Frequency Cepstral Coefficients (MFCCs) to transform raw audio waveforms into a compact and representative set of features, which the proposed framework then uses to classify genre. Experimental results show a classification accuracy of 78%, indicating the system's potential to streamline the organization of Bangla music by genre.
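The pipeline summarized above (MFCC features fed frame by frame to an LSTM that outputs a distribution over ten genres) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the frame size, hop length, number of mel filters, hidden size, and the synthetic sine-wave clip are all assumptions chosen for illustration, and the weights are random and untrained, so the resulting probabilities carry no meaning beyond showing the data flow.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mfcc(signal, sr=22050, n_fft=512, hop=256, n_mels=26, n_mfcc=13):
    """Return MFCCs with shape (n_frames, n_mfcc). All defaults are assumed values."""
    # 1. Slice the waveform into overlapping, Hamming-windowed frames.
    n_frames = 1 + (len(signal) - n_fft) // hop
    idx = np.arange(n_fft)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hamming(n_fft)
    # 2. Power spectrum of each frame.
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2 / n_fft
    # 3. Triangular filters spaced evenly on the mel scale.
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fbank = np.zeros((n_mels, n_fft // 2 + 1))
    for m in range(1, n_mels + 1):
        lo, mid, hi = bins[m - 1], bins[m], bins[m + 1]
        for k in range(lo, mid):      # rising edge of the triangle
            fbank[m - 1, k] = (k - lo) / (mid - lo)
        for k in range(mid, hi):      # falling edge of the triangle
            fbank[m - 1, k] = (hi - k) / (hi - mid)
    # 4. Log mel energies, then an (unnormalized) DCT-II to decorrelate them.
    log_mel = np.log(power @ fbank.T + 1e-10)
    n = np.arange(n_mels)
    dct = np.cos(np.pi * np.arange(n_mfcc)[:, None] * (2 * n + 1) / (2 * n_mels))
    return log_mel @ dct.T

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def init_params(n_features=13, hidden=32, n_genres=10, seed=0):
    """Random, untrained weights (hypothetical sizes) for one LSTM layer plus a linear head."""
    rng = np.random.default_rng(seed)
    W = rng.normal(0.0, 0.1, (n_features, 4 * hidden))  # input-to-gate weights
    U = rng.normal(0.0, 0.1, (hidden, 4 * hidden))      # hidden-to-gate weights
    b = np.zeros(4 * hidden)
    W_out = rng.normal(0.0, 0.1, (hidden, n_genres))
    b_out = np.zeros(n_genres)
    return W, U, b, W_out, b_out

def lstm_classify(features, params):
    """Run the LSTM over the frames and return softmax probabilities over genres."""
    W, U, b, W_out, b_out = params
    hidden = U.shape[0]
    h = np.zeros(hidden)
    c = np.zeros(hidden)
    for x in features:
        z = x @ W + h @ U + b
        i, f, o, g = np.split(z, 4)   # input, forget, output gates and candidate
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
    logits = h @ W_out + b_out        # classify from the final hidden state
    e = np.exp(logits - logits.max())
    return e / e.sum()

# Stand-in for a one-second audio clip: a 440 Hz sine wave (not real Bangla music).
sr = 22050
t = np.arange(sr) / sr
clip = np.sin(2 * np.pi * 440.0 * t)

feats = mfcc(clip, sr=sr)
feats = (feats - feats.mean(axis=0)) / (feats.std(axis=0) + 1e-8)  # per-coefficient scaling
probs = lstm_classify(feats, init_params())
print(feats.shape, probs.shape)
```

In practice the MFCC step would typically come from an audio library and the LSTM from a deep learning framework, with the weights learned from labeled clips; the sketch only makes the shapes explicit: each clip becomes a sequence of MFCC vectors, and the final LSTM state is mapped to ten genre probabilities.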