Like most 'in the wild' data collections of the natural world, the North America Camera Trap Images (NACTI) dataset shows severe long-tailed class imbalance: the largest 'Head' class alone covers more than 50% of the 3.7M images in the corpus. Building on the PyTorch Wildlife model, we present a systematic study of Long-Tail Recognition (LTR) methodologies for species recognition on the NACTI dataset, covering experiments on various LTR loss functions plus LTR-sensitive regularisation. Our best configuration achieves 99.40% Top-1 accuracy on our NACTI test data split, substantially improving over a 95.51% baseline that uses standard cross-entropy with Adam. This also improves on the previously reported top performance of 96.8% in MLWIC2, although that result relies on partly unpublished (potentially different) partitioning, optimiser, and evaluation protocols. To evaluate domain shift (e.g. night-time captures, occlusion, motion blur) towards other datasets, we construct a Reduced-Bias Test set from the ENA-Detection dataset, on which our experimentally optimised long-tail enhanced model achieves a leading 52.55% accuracy (up from 51.20% with WCE loss), demonstrating stronger generalisation capabilities under distribution shift. We document consistent improvements from LTR-enhancing scheduler choices in this NACTI wildlife domain, particularly when used in tandem with state-of-the-art LTR losses. Finally, we discuss qualitative and quantitative shortcomings that LTR methods cannot sufficiently address, including catastrophic breakdown for 'Tail' classes under severe domain shift. For maximum reproducibility, we publish all dataset splits, key code, and full network weights.
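As a minimal illustration of the kind of loss re-weighting the abstract compares against, the sketch below computes inverse-frequency class weights and a weighted cross-entropy for a single sample. It assumes that "WCE" denotes weighted cross-entropy and that the weights follow the common "balanced" heuristic (total / (number of classes × class count)); the abstract does not specify the weighting scheme, and the function names and toy label counts are hypothetical, not taken from the published code.

```python
import math
from collections import Counter


def inverse_frequency_weights(labels):
    """Per-class weights total / (n_classes * count_c).

    Assumed weighting scheme (not confirmed by the source). Rare classes
    get larger weights, and the average weight over all samples is 1.
    """
    counts = Counter(labels)
    n_classes = len(counts)
    total = len(labels)
    return {c: total / (n_classes * n) for c, n in counts.items()}


def weighted_cross_entropy(logits, target, weights):
    """Weighted cross-entropy for one sample.

    Computes -w[target] * log(softmax(logits)[target]) using a
    numerically stable log-sum-exp.
    """
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return weights[target] * (log_sum_exp - logits[target])


# Toy long-tailed label set: class 0 is the 'Head', class 2 the 'Tail'.
toy_labels = [0] * 6 + [1] * 3 + [2] * 1
toy_weights = inverse_frequency_weights(toy_labels)

# With uniform logits, a misfit on the tail class costs more than on the head.
tail_loss = weighted_cross_entropy([0.0, 0.0, 0.0], 2, toy_weights)
head_loss = weighted_cross_entropy([0.0, 0.0, 0.0], 0, toy_weights)
```

With these toy counts the tail class receives six times the weight of the head class, so errors on rare species contribute proportionally more to the training signal. The LTR losses studied in the paper may differ substantially from this simple re-weighting.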