Insects represent half of all global biodiversity, yet many of the world's insects are disappearing, with severe implications for ecosystems and agriculture. Despite this crisis, data on insect diversity and abundance remain woefully inadequate, due to the scarcity of human experts and the lack of scalable tools for monitoring. Ecologists have started to adopt camera traps to record and study insects, and have proposed computer vision algorithms as an answer for scalable data processing. However, insect monitoring in the wild poses unique challenges that have not yet been addressed within computer vision, including the combination of long-tailed data, extremely similar classes, and significant distribution shifts. We provide the first large-scale machine learning benchmarks for fine-grained insect recognition, designed to match real-world tasks faced by ecologists. Our contributions include a curated dataset of images from citizen science platforms and museums, and an expert-annotated dataset drawn from automated camera traps across multiple continents, designed to test out-of-distribution generalization under field conditions. We train and evaluate a variety of baseline algorithms and introduce a combination of data augmentation techniques that enhance generalization across geographies and hardware setups. Code and datasets are made publicly available.