This study presents a machine learning framework for assessing similarity between audio content and predicting sentiment scores. We construct a dataset of audio samples from YouTube music covers paired with the audio of the corresponding original songs, together with sentiment scores derived from user comments, which serve as proxy labels for content quality. Our approach involves extensive pre-processing: segmenting audio signals into 30-second windows and extracting high-dimensional feature representations via Mel-Frequency Cepstral Coefficients (MFCC), Chroma, Spectral Contrast, and Temporal features. Training a regression model on each of these feature sets, we predict sentiment scores on a 0-100 scale, achieving root mean square error (RMSE) values of 3.420, 5.482, 2.783, and 4.212, respectively. Improvements over a baseline based on absolute-difference metrics are observed. These results demonstrate the potential of machine learning to capture sentiment and similarity in audio, offering an adaptable framework for AI applications in media analysis.
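The segmentation step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not specify the sample rate, window overlap, or padding policy, so this sketch assumes non-overlapping 30-second windows at 22.05 kHz and drops any trailing samples shorter than one full window.

```python
import numpy as np

def segment_audio(signal, sr, window_s=30):
    """Split a 1-D audio signal into non-overlapping fixed-length windows.

    Hypothetical helper illustrating the 30-second segmentation step.
    Trailing samples shorter than one window are dropped, since the
    abstract does not state how partial windows are handled.
    """
    win = window_s * sr                      # samples per window
    n = len(signal) // win                   # number of complete windows
    return signal[: n * win].reshape(n, win)

# Example: 95 s of synthetic audio at an assumed 22.05 kHz sample rate
sr = 22050
audio = np.zeros(95 * sr, dtype=np.float32)
segments = segment_audio(audio, sr)
print(segments.shape)  # (3, 661500) -- three complete 30 s windows
```

Each resulting row would then be passed to a feature extractor (e.g., MFCC, Chroma, Spectral Contrast) to produce the fixed-dimensional vectors used for regression.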