Advancements in machine learning, computer vision, and robotics have paved the way for transformative solutions in various domains, particularly in agriculture. For example, accurate identification and segmentation of fruits in field images play a crucial role in automating tasks such as harvesting, disease detection, and yield estimation. However, robust and precise infield fruit segmentation remains challenging, since large amounts of labeled data are required to handle variations in fruit size, shape, color, and occlusion. In this paper, we develop a few-shot semantic segmentation framework for infield fruits using transfer learning. Concretely, our work targets agricultural domains that lack publicly available labeled data. Motivated by similar success in urban scene parsing, we propose specialized pre-training on a public benchmark dataset for fruit transfer learning. By leveraging pre-trained neural networks, we achieve accurate semantic segmentation of fruit in the field with only a few labeled images. Furthermore, we show that pre-trained models learn to distinguish between fruit still on the trees and fruit that has fallen on the ground, and that they can effectively transfer this knowledge to the target fruit dataset.