The Three-River-Source region is a nationally important nature reserve in China that harbors abundant wild plant resources. To meet the practical requirements of botanical research and intelligent plant management, we construct a large-scale dataset for Plant detection in the Three-River-Source region (PTRS). The dataset comprises 6,965 high-resolution images of 2160 × 3840 pixels, captured by diverse sensors and platforms and featuring objects of varying shapes and sizes. A team of botanical image interpretation experts annotated these images with 21 commonly occurring object categories. The fully annotated PTRS images contain 122,300 instances of plant leaves, each labeled with a horizontal bounding box. PTRS poses challenges such as dense occlusion, varying leaf resolutions, and high feature similarity among plants, which prompted us to develop a novel object detection network named PlantDet. This network employs a window-based efficient self-attention module (ST block) to generate robust feature representations at multiple scales, improving detection efficiency for small and densely occluded objects. Our experimental results validate the efficacy of the proposed plant detection benchmark, with a precision of 88.1%, a mean average precision (mAP) of 77.6%, and higher recall than the baseline. Our method also effectively mitigates the problem of missed small objects. We intend to share our data and code with interested parties to advance further research in this field.
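As an illustration of how the reported precision and recall relate to horizontal bounding-box annotations, the following is a minimal sketch and not the authors' evaluation code. It assumes boxes are given as `(x_min, y_min, x_max, y_max)` in pixels, a single category, greedy matching in descending score order, and an IoU threshold of 0.5; the function names `iou` and `precision_recall` and the threshold are assumptions made for this example, since the abstract does not state the evaluation protocol.

```python
from typing import List, Tuple

# Assumed box format: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two horizontal (axis-aligned) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def precision_recall(
    preds: List[Tuple[Box, float]],  # (box, confidence score)
    gts: List[Box],
    iou_thr: float = 0.5,  # assumed threshold, not taken from the paper
) -> Tuple[float, float]:
    """Greedy one-to-one matching of predictions to ground-truth boxes."""
    matched = set()  # indices of ground-truth boxes already claimed
    tp = 0
    # Higher-confidence predictions get first choice of ground truth.
    for box, _score in sorted(preds, key=lambda p: p[1], reverse=True):
        best_iou, best_j = 0.0, -1
        for j, gt in enumerate(gts):
            if j in matched:
                continue
            overlap = iou(box, gt)
            if overlap > best_iou:
                best_iou, best_j = overlap, j
        if best_j >= 0 and best_iou >= iou_thr:
            matched.add(best_j)
            tp += 1
    fp = len(preds) - tp  # predictions with no matching leaf
    fn = len(gts) - tp    # leaves that were missed
    precision = tp / (tp + fp) if preds else 0.0
    recall = tp / (tp + fn) if gts else 0.0
    return precision, recall


if __name__ == "__main__":
    # Toy data: two annotated leaves, one correct and one spurious detection.
    ground_truth = [(0, 0, 10, 10), (20, 20, 30, 30)]
    detections = [((0, 0, 10, 10), 0.9), ((50, 50, 60, 60), 0.8)]
    print(precision_recall(detections, ground_truth))
```

In this toy case one of the two detections matches a leaf and one of the two leaves is found, so both precision and recall are 0.5. A missed small object appears as a false negative and lowers recall, which is why recall is the metric most sensitive to the small-object problem the abstract describes. mAP additionally averages precision over recall levels and over the 21 categories, which this sketch does not cover.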