Tomatoes are a major crop worldwide, and accurately classifying their maturity is important for agricultural applications such as harvesting, grading, and quality control. In this paper, the authors propose a novel method for tomato maturity classification based on a convolutional transformer, a hybrid architecture that combines the strengths of convolutional neural networks (CNNs) and transformers. The study also introduces a new tomato dataset, KUTomaData, designed specifically for training deep-learning models for tomato segmentation and classification. KUTomaData comprises approximately 700 images, collected in a greenhouse in the UAE, for training and testing. The images were captured under varied lighting conditions and viewing perspectives and with different mobile camera sensors, which distinguishes the dataset from existing ones. The contributions of this paper are threefold. First, the authors propose a novel method for tomato maturity classification using a modular convolutional transformer. Second, the authors introduce a new tomato image dataset containing tomatoes at different maturity levels. Third, the authors show that the convolutional transformer outperforms state-of-the-art methods for tomato maturity classification. To evaluate how well the proposed framework handles cluttered and occluded tomato instances, two additional public datasets, Laboro Tomato and Rob2Pheno Annotated Tomato, were used as benchmarks. Across the three datasets, the proposed framework surpasses the state of the art by 58.14%, 65.42%, and 66.39% in mean average precision on KUTomaData, Laboro Tomato, and Rob2Pheno Annotated Tomato, respectively.
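Since the results above are reported as mean average precision (mAP), a minimal sketch of how such a score can be computed may help readers interpret them. This is an illustration under stated assumptions, not the authors' evaluation code: the abstract does not specify the matching rule (for example, an IoU threshold) or the interpolation scheme, so the sketch assumes each prediction has already been marked correct or incorrect and uses plain non-interpolated average precision. The function names and the class labels `ripe` and `unripe` are hypothetical.

```python
from typing import Dict, List, Tuple

# A prediction is (confidence score, whether it matched a ground-truth instance).
# How a match is decided (e.g. an IoU threshold) is assumed to happen upstream.
Prediction = Tuple[float, bool]


def average_precision(predictions: List[Prediction], num_positives: int) -> float:
    """Non-interpolated average precision for a single class."""
    if num_positives == 0:
        return 0.0
    # Rank predictions from most to least confident.
    ranked = sorted(predictions, key=lambda p: p[0], reverse=True)
    true_positives = 0
    precision_sum = 0.0
    for rank, (_, is_correct) in enumerate(ranked, start=1):
        if is_correct:
            true_positives += 1
            # Precision measured at the rank of each correct prediction.
            precision_sum += true_positives / rank
    return precision_sum / num_positives


def mean_average_precision(
    per_class: Dict[str, List[Prediction]],
    positives: Dict[str, int],
) -> float:
    """Mean of the per-class average precision values."""
    aps = [average_precision(per_class.get(c, []), n) for c, n in positives.items()]
    return sum(aps) / len(aps) if aps else 0.0


if __name__ == "__main__":
    # Hypothetical toy predictions for two maturity classes.
    toy_predictions = {
        "ripe": [(0.9, True), (0.8, False), (0.7, True)],
        "unripe": [(0.6, True)],
    }
    toy_positives = {"ripe": 2, "unripe": 2}
    print(mean_average_precision(toy_predictions, toy_positives))
```

In this toy case the `ripe` class scores (1/1 + 2/3) / 2 and the `unripe` class scores 1/2, because one of its two ground-truth instances is never detected; mAP is the mean of the two values.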