Trap-Based Pest Counting: Multiscale and Deformable Attention CenterNet Integrating Internal LR and HR Joint Feature Learning

Pest counting, which estimates the number of pests at an early stage, is important because it enables rapid pest control, reduces crop damage, and improves productivity. In recent years, light traps have been increasingly used to lure and photograph pests for counting. However, pest appearance in trap images varies widely owing to severe occlusion, wide pose variation, and scale variation, which makes pest counting challenging. To address these issues, this study proposes a new pest counting model, the multiscale and deformable attention CenterNet (Mada-CenterNet), for internal low-resolution (LR) and high-resolution (HR) joint feature learning. Compared with the conventional CenterNet, Mada-CenterNet adopts a two-step multiscale heatmap generation approach that predicts LR and HR heatmaps adapted to scale variations, that is, changes in the number of pests. In addition, to overcome the pose and occlusion problems, a new between-hourglass skip connection based on deformable and multiscale attention is designed to ensure internal LR and HR joint feature learning and to incorporate geometric deformation, thereby improving pest counting accuracy. Experiments verify that Mada-CenterNet generates the HR heatmap more accurately and improves pest counting accuracy owing to multiscale heatmap generation, joint internal feature learning, and deformable and multiscale attention. The model is also confirmed to be effective against severe occlusion and variations in pose and scale. The experimental results show that it outperforms state-of-the-art crowd counting and object detection models.
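
As a rough illustration of how a center heatmap can be turned into a count, the sketch below applies the local-maximum peak extraction commonly used with CenterNet-style detectors. It is not the Mada-CenterNet implementation: the function name `count_heatmap_peaks`, the 3x3 neighborhood, and the 0.3 threshold are illustrative assumptions, and the toy heatmap merely stands in for a predicted HR heatmap.

```python
def count_heatmap_peaks(heatmap, threshold=0.3):
    """Count local maxima in a 2D heatmap (list of lists of floats).

    A cell is treated as an object center if its score is at least
    `threshold` and no cell in its 3x3 neighborhood scores higher.
    Both the neighborhood size and the threshold are assumptions made
    for illustration, not values taken from the paper.
    """
    rows = len(heatmap)
    cols = len(heatmap[0]) if rows else 0
    centers = []
    for r in range(rows):
        for c in range(cols):
            value = heatmap[r][c]
            if value < threshold:
                continue
            # Collect the 3x3 neighborhood, clipped at the image border.
            neighbors = [
                heatmap[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
            ]
            if value >= max(neighbors):
                centers.append((r, c))
    return len(centers), centers


# Toy 8x8 heatmap with two clear peaks, one suppressed neighbor,
# and one response below the threshold.
toy = [[0.0] * 8 for _ in range(8)]
toy[2][2] = 0.9   # peak
toy[2][3] = 0.5   # adjacent to a stronger peak, so suppressed
toy[6][5] = 0.7   # peak
toy[0][7] = 0.2   # below threshold, so ignored

count, centers = count_heatmap_peaks(toy)
print(count, centers)
```

Note that adjacent cells with exactly equal scores would each be counted in this simple version; a practical pipeline would need a tie-breaking rule, and heavily occluded pests may merge into a single peak, which is the kind of failure the abstract's attention and joint-learning components are meant to reduce.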
