The Three-River-Source region is one of China's most significant nature reserves and harbors a wealth of wild plant resources. To meet the practical requirements of botanical research and intelligent plant management, we construct a large-scale dataset for Plant detection in the Three-River-Source region (PTRS). The dataset comprises 6965 high-resolution images of 2160 × 3840 pixels, captured by diverse sensors and platforms and featuring objects of varying shapes and sizes. A team of botanical image interpretation experts annotated these images with 21 commonly occurring object categories. The fully annotated PTRS images contain 122,300 instances of plant leaves, each labeled with a horizontal rectangle. PTRS poses challenges such as dense occlusion, varying leaf resolutions, and high feature similarity among plants, which prompted us to develop a novel object detection network named PlantDet. The network employs a window-based efficient self-attention module (ST block) to generate robust feature representations at multiple scales, improving detection efficiency for small and densely occluded objects. Experimental results validate the proposed plant detection benchmark: our method achieves a precision of 88.1% and a mean average precision (mAP) of 77.6%, with higher recall than the baseline. It also effectively mitigates the problem of missed small objects. We intend to share our data and code with interested parties to advance further research in this field.
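To make the reported metrics concrete, the following minimal sketch shows how precision and recall can be computed for horizontal-rectangle annotations. It is illustrative only: the abstract does not state the evaluation protocol, so the IoU threshold of 0.5, the greedy score-ordered matching, the `(x_min, y_min, x_max, y_max)` box layout, and the names `iou` and `precision_recall` are all assumptions rather than the authors' code.

```python
from typing import List, Tuple

# Assumed box layout: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned (horizontal) boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def precision_recall(
    preds: List[Tuple[Box, float]],
    truths: List[Box],
    iou_threshold: float = 0.5,  # assumed threshold, not taken from the paper
) -> Tuple[float, float]:
    """Greedily match predictions (highest score first) to unmatched ground truth."""
    matched = set()
    true_pos = 0
    for box, _score in sorted(preds, key=lambda p: p[1], reverse=True):
        best_iou, best_idx = 0.0, -1
        for idx, truth in enumerate(truths):
            if idx in matched:
                continue
            overlap = iou(box, truth)
            if overlap > best_iou:
                best_iou, best_idx = overlap, idx
        if best_idx >= 0 and best_iou >= iou_threshold:
            matched.add(best_idx)
            true_pos += 1
    precision = true_pos / len(preds) if preds else 0.0
    recall = true_pos / len(truths) if truths else 0.0
    return precision, recall


# Toy example with made-up boxes: two of three predictions hit a leaf,
# and one of three annotated leaves is missed.
truths = [(0, 0, 10, 10), (20, 20, 30, 30), (50, 50, 60, 60)]
preds = [((1, 1, 10, 10), 0.9), ((20, 20, 30, 31), 0.8), ((100, 100, 110, 110), 0.7)]
precision, recall = precision_recall(preds, truths)
```

In this toy case both precision and recall equal 2/3. A missed small leaf lowers recall without affecting precision, which is why the two figures are reported separately.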