Autonomous aerial harvesting is a highly complex problem because it requires numerous interdisciplinary algorithms to run on small, low-power computing devices. Object detection is one such algorithm, and it is compute-hungry. In this context, we make the following contributions: (i) Fast Fruit Detector (FFD), a resource-efficient, single-stage, postprocessing-free object detector based on our novel latent object representation (LOR) module, query assignment, and prediction strategy. FFD achieves 100 FPS at FP32 precision on the latest 10 W NVIDIA Jetson-NX embedded device while co-existing with other time-critical sub-systems such as control, grasping, and SLAM, which is a major achievement of this work. (ii) A method to generate vast amounts of training data without exhaustive manual labelling of fruit images, which is costly and time-consuming because each image contains a large number of instances. (iii) An open-source fruit detection dataset with many very small instances that are difficult to detect. Our exhaustive evaluations on our dataset and the MinneApple dataset show that FFD, despite being only a single-scale detector, is more accurate than many representative detectors: it outperforms single-scale Faster-RCNN by 10.7 AP, multi-scale Faster-RCNN by 2.3 AP, the latest single-scale YOLO-v8 by 8 AP, and multi-scale YOLO-v8 by 0.3 AP, while being considerably faster.