Training Deep Neural Network (DNN) models for bioacoustic applications is challenging because labelled data are scarce, and manually labelling the required amount of data can be prohibitively expensive. To identify both new and existing classes effectively, DNN models must keep learning new features from a modest amount of fresh data. Active Learning (AL) supports this kind of learning while requiring little labelling effort. However, fixed feature extraction limits feature quality, so the benefits of AL are not fully realised. We describe an AL framework that addresses this issue by incorporating feature extraction into the AL loop and refining the feature extractor after each round of manual annotation. In addition, we process raw audio rather than spectrograms, which is a novel approach. Experiments show that the proposed AL framework requires 14.3%, 66.7%, and 47.4% less labelling effort on the benchmark audio datasets ESC-50, UrbanSound8k, and InsectWingBeat, respectively, for a large DNN model, with similar savings on a microcontroller-based counterpart. Furthermore, we demonstrate the practical relevance of our study by incorporating data from conservation biology projects.
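The loop described above, in which the feature extractor is refined after each round of manual annotation rather than held fixed, can be illustrated with a minimal sketch. It is not the framework itself: the abstract does not state the query strategy, so least-confidence sampling is assumed here, and the DNN operating on raw audio is replaced by two deliberately simple stand-ins. All names (`Standardizer`, `CentroidClassifier`, `active_learning`, `oracle`) are hypothetical and chosen only for illustration.

```python
import math
import random


class Standardizer:
    """Stand-in feature extractor (assumption): per-dimension standardization
    whose statistics are learned from the currently labelled samples."""

    def fit(self, X):
        n, d = len(X), len(X[0])
        self.mean = [sum(x[j] for x in X) / n for j in range(d)]
        # Fall back to 1.0 when a dimension has zero variance.
        self.std = [
            math.sqrt(sum((x[j] - self.mean[j]) ** 2 for x in X) / n) or 1.0
            for j in range(d)
        ]

    def transform(self, X):
        return [[(x[j] - self.mean[j]) / self.std[j] for j in range(len(x))]
                for x in X]


class CentroidClassifier:
    """Stand-in classifier (assumption): softmax over negative squared
    distance to each class centroid in feature space."""

    def fit(self, F, y):
        self.centroids = {}
        for c in sorted(set(y)):
            rows = [f for f, label in zip(F, y) if label == c]
            self.centroids[c] = [sum(col) / len(rows) for col in zip(*rows)]

    def predict_proba(self, F):
        out = []
        for f in F:
            scores = [-sum((a - b) ** 2 for a, b in zip(f, m))
                      for m in self.centroids.values()]
            top = max(scores)  # subtract the maximum for numerical stability
            exps = [math.exp(s - top) for s in scores]
            total = sum(exps)
            out.append([e / total for e in exps])
        return out


def active_learning(pool, oracle, rounds=4, batch=5, seed=0):
    """Run `rounds` annotation rounds; `oracle(i)` plays the human annotator."""
    rng = random.Random(seed)
    labelled = rng.sample(range(len(pool)), batch)  # random initial set
    labels = {i: oracle(i) for i in labelled}
    extractor, clf = Standardizer(), CentroidClassifier()

    def refit():
        # Both the extractor and the classifier are re-fitted on all labels so far.
        X = [pool[i] for i in labelled]
        y = [labels[i] for i in labelled]
        extractor.fit(X)
        clf.fit(extractor.transform(X), y)

    refit()
    for _ in range(rounds):
        unlabelled = [i for i in range(len(pool)) if i not in labels]
        if not unlabelled:
            break
        probs = clf.predict_proba(
            extractor.transform([pool[i] for i in unlabelled]))
        # Least-confidence sampling: lowest top-class probability first.
        ranked = sorted(range(len(unlabelled)), key=lambda k: max(probs[k]))
        for k in ranked[:batch]:
            i = unlabelled[k]
            labels[i] = oracle(i)  # manual annotation step
            labelled.append(i)
        refit()  # refine the feature extractor after each annotation round
    return extractor, clf, labelled
```

Calling `active_learning(pool, lambda i: truth[i])` on a list of feature vectors `pool` with ground-truth labels `truth` returns the fitted stand-ins and the indices that were sent for annotation. A fixed-extractor baseline would call `extractor.fit` once before the loop and re-fit only the classifier, which is the limitation the framework is designed to remove.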