Long-tailed Species Recognition in the NACTI Wildlife Dataset

Like most 'in the wild' data collections of the natural world, the North America Camera Trap Images (NACTI) dataset shows severe long-tailed class imbalance: the largest 'Head' class alone covers more than 50% of the 3.7M images in the corpus. Building on the PyTorch Wildlife model, we present a systematic study of Long-Tail Recognition (LTR) methodologies for species recognition on the NACTI dataset, covering experiments on various LTR loss functions and LTR-sensitive regularisation. Our best configuration achieves 99.40% Top-1 accuracy on our NACTI test data split, substantially improving over the 95.51% baseline that uses standard cross-entropy with Adam. This also improves on the previously reported top performance of 96.8% in MLWIC2, albeit under partly unpublished (potentially different) partitioning, optimiser, and evaluation protocols. To evaluate domain shift (e.g. night-time captures, occlusion, motion blur) towards other datasets, we construct a Reduced-Bias Test set from the ENA-Detection dataset, on which our experimentally optimised long-tail enhanced model achieves a leading 52.55% accuracy (up from 51.20% with WCE loss), demonstrating stronger generalisation under distribution shift. We document consistent improvements from LTR-enhancing scheduler choices in the NACTI wildlife domain, particularly when combined with state-of-the-art LTR losses. Finally, we discuss qualitative and quantitative shortcomings that LTR methods cannot sufficiently address, including catastrophic breakdown for 'Tail' classes under severe domain shift. For maximum reproducibility, we publish all dataset splits, key code, and full network weights.
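To make the class-weighting idea behind the loss comparison concrete, the following is a minimal, dependency-free sketch of a class-weighted cross-entropy. It rests on assumptions that the abstract does not confirm: that "WCE" stands for weighted cross-entropy, and that class weights follow the common inverse-frequency ("balanced") heuristic. The function names `balanced_class_weights` and `weighted_cross_entropy` are hypothetical and are not taken from the authors' released code, which builds on PyTorch.

```python
import math
from collections import Counter


def balanced_class_weights(labels):
    """Inverse-frequency weights: w_c = N / (K * n_c).

    Assumption: this 'balanced' heuristic is only one possible
    weighting; the abstract does not state which one the authors use.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {c: n_samples / (n_classes * n) for c, n in counts.items()}


def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_cross_entropy(logits_batch, targets, weights):
    """Weighted mean of per-sample negative log-likelihoods.

    Each sample's loss is scaled by the weight of its true class, and
    the sum is normalised by the total weight in the batch.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for logits, target in zip(logits_batch, targets):
        p_true = softmax(logits)[target]
        w = weights[target]
        weighted_sum += -w * math.log(p_true)
        weight_total += w
    return weighted_sum / weight_total


# Toy long-tailed label set: class 0 is the 'Head', class 1 the 'Tail'.
labels = [0] * 8 + [1] * 2
weights = balanced_class_weights(labels)  # Tail class gets the larger weight
loss = weighted_cross_entropy([[0.0, 0.0]] * len(labels), labels, weights)
```

With these toy labels, an error on the rare class contributes four times as much to the loss as an error on the frequent class, which is the mechanism such weighting relies on to counteract a dominant 'Head' class.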

