Infant crying can serve as a crucial indicator of various physiological and emotional states. This paper introduces a comprehensive approach to detecting infant cries in audio data. We integrate Wav2Vec with traditional audio features and employ Gradient Boosting Machines for cry classification. We validate our approach on a real-world dataset, demonstrating significant performance improvements over existing methods.
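The fusion step described in the abstract can be illustrated with a minimal sketch. It assumes that fusion means concatenating a fixed-length learned embedding with handcrafted features; the abstract does not say which traditional features are used, so zero-crossing rate and RMS energy are stand-ins chosen for illustration. The embedding is a placeholder vector rather than real Wav2Vec output, and the function names are hypothetical.

```python
import math
from typing import List, Sequence


def zero_crossing_rate(samples: Sequence[float]) -> float:
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(samples) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(samples, samples[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(samples) - 1)


def rms_energy(samples: Sequence[float]) -> float:
    """Root-mean-square amplitude of the clip."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def fuse_features(embedding: Sequence[float], samples: Sequence[float]) -> List[float]:
    """Concatenate a learned embedding with handcrafted features.

    Assumption: `embedding` stands in for a pooled Wav2Vec vector; the two
    handcrafted features are illustrative, not the paper's feature set.
    """
    handcrafted = [zero_crossing_rate(samples), rms_energy(samples)]
    return list(embedding) + handcrafted


# Toy clip: a 440 Hz sine tone, 0.1 s at 16 kHz (synthetic, not cry audio).
SAMPLE_RATE = 16000
clip = [math.sin(2 * math.pi * 440 * n / SAMPLE_RATE) for n in range(1600)]

# Placeholder for a learned embedding (hypothetical 4-dimensional vector).
placeholder_embedding = [0.0] * 4

fused = fuse_features(placeholder_embedding, clip)
```

In a full pipeline, one fused vector per clip would form a row of the training matrix for a gradient boosting classifier, for example scikit-learn's `GradientBoostingClassifier`; that choice of library is an assumption here, not something the abstract states.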