This paper, a technical summary of our preceding publication, introduces a robust machine learning framework for detecting the vocal activity of coppery titi monkeys. By combining Mel-frequency cepstral coefficient (MFCC) features with a bidirectional long short-term memory (LSTM) classifier, we address the challenges posed by the small amount of expert-annotated vocal data available. Our approach significantly reduces false positives and improves the accuracy of call detection in bioacoustic research. Initial results show an accuracy of 95\% on instance predictions, highlighting the model's effectiveness in identifying and classifying complex vocal patterns in environmental audio recordings. Moreover, we show how call classification can be performed downstream of detection, paving the way for real-world monitoring.