Passive acoustic monitoring enables continuous, non-invasive biodiversity assessment across diverse ecosystems. The scale of these datasets has driven the adoption of machine learning, with supervised approaches showing strong performance. However, supervised methods require time-resolved annotated datasets, which remain scarce, especially in complex tropical soundscapes. We present PteroSet, a curated dataset of strongly annotated Neotropical bird vocalizations recorded in Puerto Asis (Putumayo) and Pivijay (Magdalena), Colombia, between 2023 and 2025. The dataset comprises 563 recordings (73.62 h) and 15,372 time-frequency annotations, including 6,702 events identified to the species level across 168 species. We release the annotations in a COCO-inspired JSON schema that unifies audio files, taxonomic categories, and labels for machine learning workflows. Beyond providing annotated data, PteroSet serves as a realistic benchmark that highlights key characteristics of tropical soundscapes, including acoustic co-occurrence and domain shift across recording sites. We provide a deep learning baseline for binary bird detection, demonstrating PteroSet's usability and the challenges it presents.
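To make the COCO-inspired layout concrete, the following is a minimal sketch of how such a file might tie recordings, taxonomic categories, and time-frequency labels together, and how binary detection labels could be derived from it. The abstract does not specify the schema, so every key name (`audios`, `categories`, `annotations`, `audio_id`, `rank`, `bbox`), the box ordering, and the file names are illustrative assumptions rather than the published PteroSet format.

```python
from collections import defaultdict

# Hypothetical COCO-style record. All keys and values are assumptions made
# for illustration; they are not taken from the released PteroSet files.
example = {
    "audios": [
        {"id": 1, "file_name": "site_a_0001.wav", "duration": 60.0},
        {"id": 2, "file_name": "site_b_0001.wav", "duration": 60.0},
    ],
    "categories": [
        {"id": 10, "name": "Species A", "rank": "species"},
        {"id": 11, "name": "Unidentified bird", "rank": "class"},
    ],
    "annotations": [
        # bbox = [t_start_s, f_low_hz, t_end_s, f_high_hz] (assumed ordering)
        {"id": 100, "audio_id": 1, "category_id": 10,
         "bbox": [1.0, 2000.0, 2.5, 6000.0]},
        {"id": 101, "audio_id": 1, "category_id": 11,
         "bbox": [2.0, 1500.0, 4.0, 5000.0]},
        {"id": 102, "audio_id": 2, "category_id": 10,
         "bbox": [10.0, 3000.0, 11.0, 7000.0]},
    ],
}


def index_by_audio(dataset):
    """Group annotations by the recording they belong to."""
    by_audio = defaultdict(list)
    for ann in dataset["annotations"]:
        by_audio[ann["audio_id"]].append(ann)
    return dict(by_audio)


def species_level(dataset):
    """Return only the annotations whose category is resolved to species."""
    species_ids = {
        cat["id"] for cat in dataset["categories"] if cat["rank"] == "species"
    }
    return [a for a in dataset["annotations"] if a["category_id"] in species_ids]


def has_bird(dataset, audio_id, start, end):
    """Binary label for the window [start, end) of one recording.

    True if any annotation in that recording overlaps the window in time.
    """
    for ann in dataset["annotations"]:
        if ann["audio_id"] != audio_id:
            continue
        t_start, _, t_end, _ = ann["bbox"]
        if t_start < end and t_end > start:
            return True
    return False
```

Under these assumptions, a binary detection baseline would slide fixed-length windows over each recording and call `has_bird` to obtain the target for each window. Overlapping boxes within one recording, as in recording 1 above, correspond to the acoustic co-occurrence the abstract describes.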