Weakly and Semi-Supervised Detection, Segmentation and Tracking of Table Grapes with Limited and Noisy Data

Detection, segmentation and tracking of fruits and vegetables are three fundamental tasks in precision agriculture, enabling applications such as robotic harvesting and yield estimation. However, modern algorithms are data-hungry, and it is not always possible to gather enough data to apply the best-performing supervised approaches. Because data collection is expensive and cumbersome, the enabling technologies for computer vision in agriculture are often out of reach for small businesses. In previous work, we proposed an initial weakly supervised solution that reduces the data needed to reach state-of-the-art detection and segmentation in precision agriculture applications; here we improve that system and explore the problem of tracking fruits in orchards. We present the case of table grape vineyards in southern Lazio (Italy), since grapes are difficult to segment due to occlusion, color and general illumination conditions. We consider the case in which some initial labelled data is available as source data (e.g., wine grape data), but it differs considerably from the target data (e.g., table grape data). To improve detection and segmentation on the target data, we propose to train the segmentation algorithm with weak bounding box labels, while for tracking we leverage 3D Structure from Motion algorithms to generate new labels from already labelled samples. Finally, the two systems are combined in a full semi-supervised approach. Comparisons with state-of-the-art supervised solutions show that our methods can train new models that achieve high performance with few labelled images and very simple labelling.
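
As an illustration of how labels might be generated from already labelled samples once a Structure from Motion reconstruction is available, the following minimal sketch transfers a bounding box from a labelled view to an unlabelled one. It is not the method of this work, whose details are not given here: it assumes a simple pinhole camera without distortion, and the names `project`, `propagate_box` and the camera tuple layout (rotation, translation, focal length, principal point) are hypothetical choices made for the example.

```python
from typing import List, Optional, Sequence, Tuple

Point3D = Tuple[float, float, float]
Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max) in pixels
# Hypothetical camera layout: (3x3 rotation, translation, focal length, principal point)
Camera = Tuple[Sequence[Sequence[float]], Sequence[float], float, Tuple[float, float]]


def project(point: Point3D, camera: Camera) -> Optional[Tuple[float, float]]:
    """Project a world point with a pinhole model; None if it lies behind the camera."""
    rotation, translation, focal, center = camera
    # Camera coordinates: R @ X + t
    cam = [
        sum(rotation[r][c] * point[c] for c in range(3)) + translation[r]
        for r in range(3)
    ]
    if cam[2] <= 0:
        return None
    return (focal * cam[0] / cam[2] + center[0], focal * cam[1] / cam[2] + center[1])


def propagate_box(
    box: Box, points: List[Point3D], src: Camera, dst: Camera
) -> Optional[Box]:
    """Transfer a labelled box from the source view to the destination view.

    The 3D points whose source projection falls inside the box are
    reprojected into the destination view, and their tight bounding box
    is returned as the new (weak) label.
    """
    reprojected = []
    for p in points:
        uv = project(p, src)
        if uv is None:
            continue
        if box[0] <= uv[0] <= box[2] and box[1] <= uv[1] <= box[3]:
            uv_dst = project(p, dst)
            if uv_dst is not None:
                reprojected.append(uv_dst)
    if not reprojected:
        return None  # no supporting 3D points visible in the destination view
    xs, ys = zip(*reprojected)
    return (min(xs), min(ys), max(xs), max(ys))


# Toy example: two cameras with identity rotation, the second shifted sideways.
IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
src_cam: Camera = (IDENTITY, [0.0, 0.0, 0.0], 100.0, (50.0, 50.0))
dst_cam: Camera = (IDENTITY, [-1.0, 0.0, 0.0], 100.0, (50.0, 50.0))
cloud = [(0.0, 0.0, 5.0), (0.5, 0.5, 5.0), (3.0, 0.0, 5.0)]
new_box = propagate_box((40.0, 40.0, 70.0, 70.0), cloud, src_cam, dst_cam)
```

In practice the propagated box depends on how densely the reconstruction covers the fruit, so a sparse point cloud would tend to give boxes that are too tight; the sketch ignores occlusion between views for the same reason of brevity.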
