In greenhouse tomato production, automated harvesting requires accurate detection of ripe tomatoes, ripeness classification, and precise picking-point localization for robotic end-effectors. This paper proposes YOLO26-RipeLoc Lite, a lightweight deep learning architecture based on YOLO26 for simultaneous detection, ripeness classification, and center-point localization of greenhouse tomatoes. The model introduces three modifications: (1) a Lightweight Feature Pyramid Network (LFPN) with depthwise separable convolutions for efficient multi-scale fusion, (2) a Ripeness-Aware Attention Module (RAAM) with dual pooling and a learnable ripeness bias vector for enhanced color-texture discrimination, and (3) a Compact Detection Head (CDH) with shared convolutions and an integrated center-point regression branch for direct grasp planning. The model is evaluated on a custom dataset of 1,500 images containing 6,227 instances (3,566 ripe, 2,661 unripe) collected at the SILAL greenhouse, Abu Dhabi, UAE. YOLO26-RipeLoc Lite achieves an mAP@0.5 of 92.9% (95.2% ripe, 90.6% unripe) and the highest precision (95.2%) among all evaluated architectures, using only 2.38M parameters. Post-training BatchNorm pruning at 30% reduces the parameter count to approximately 1.8M with negligible accuracy loss. Ablation studies show that greenhouse-aware HSV augmentation provides the largest improvement (+2.02 pp mAP@50), backbone freezing achieves peak precision (93.8%), and 3-phase progressive unfreezing yields the best localization quality (mAP@50:95 of 64.6%). Comparisons with YOLOv8n/s, YOLO11n/s, YOLO12n/s, and YOLO26s indicate a superior accuracy-efficiency trade-off: 2.9 pp higher precision than YOLO12n with 7.0% fewer parameters, plus integrated center-point localization for robotic end-effector guidance.
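To illustrate why depthwise separable convolutions suit a lightweight feature pyramid, the following minimal sketch compares the weight count of a standard convolution with that of a depthwise convolution followed by a 1×1 pointwise convolution. The channel widths and kernel size are hypothetical values chosen for illustration; they are not taken from the LFPN configuration, and the helper names are not part of any library.

```python
def conv_params(c_in: int, c_out: int, k: int = 3) -> int:
    """Weights in a standard k x k convolution (bias omitted)."""
    return k * k * c_in * c_out


def dw_separable_params(c_in: int, c_out: int, k: int = 3) -> int:
    """Weights in a depthwise k x k convolution followed by a
    1 x 1 pointwise convolution (bias omitted)."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


# Hypothetical layer shape, assumed for illustration only.
c_in, c_out, k = 128, 128, 3

standard = conv_params(c_in, c_out, k)
separable = dw_separable_params(c_in, c_out, k)
ratio = separable / standard  # equals 1/c_out + 1/k**2

print(f"standard:  {standard} weights")
print(f"separable: {separable} weights")
print(f"ratio:     {ratio:.3f}")
```

For this assumed layer shape the separable variant needs roughly 12% of the weights of the standard convolution, since the ratio reduces to 1/c_out + 1/k². The saving in an actual network depends on its real channel widths and on which layers are replaced.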