YOLO26-RipeLoc Lite: A lightweight architecture for tomato ripeness detection and picking point localization in greenhouse robotic harvesting

In greenhouse tomato production, automated harvesting requires accurate detection of ripe tomatoes, ripeness classification, and precise picking-point localization for robotic end-effectors. This paper proposes YOLO26-RipeLoc Lite, a lightweight deep learning architecture based on YOLO26 for simultaneous detection, ripeness classification, and center-point localization of greenhouse tomatoes. The model introduces three modifications: (1) a Lightweight Feature Pyramid Network (LFPN) that uses depthwise separable convolutions for efficient multi-scale fusion, (2) a Ripeness-Aware Attention Module (RAAM) that combines dual pooling with a learnable ripeness bias vector to improve color-texture discrimination, and (3) a Compact Detection Head (CDH) with shared convolutions and an integrated center-point regression branch for direct grasp planning. The model is evaluated on a custom dataset of 1,500 images containing 6,227 instances (3,566 ripe, 2,661 unripe) collected at the SILAL greenhouse in Abu Dhabi, UAE. YOLO26-RipeLoc Lite achieves an mAP@50 of 92.9% (95.2% ripe, 90.6% unripe) and the highest precision (95.2%) among all evaluated architectures, using only 2.38M parameters. Post-training BatchNorm pruning at 30% reduces the parameter count to ~1.8M with negligible accuracy loss. Ablation studies show that greenhouse-aware HSV augmentation provides the largest improvement (+2.02 pp mAP@50), backbone freezing achieves peak precision (93.8%), and 3-phase progressive unfreezing yields the best localization quality (mAP@50:95 of 64.6%). Comparisons with YOLOv8n/s, YOLO11n/s, YOLO12n/s, and YOLO26s show a better accuracy-efficiency trade-off: precision is 2.9 pp higher than YOLO12n with 7.0% fewer parameters, and the model additionally provides integrated center-point localization for robotic end-effector guidance.
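To illustrate why depthwise separable convolutions suit a lightweight fusion neck such as the LFPN, the sketch below compares the weight count of a standard k x k convolution with that of a depthwise k x k convolution followed by a pointwise 1 x 1 convolution. The function names and the 128-channel layer size are hypothetical, chosen only for illustration; they are not taken from the paper's architecture.

```python
def conv_params(c_in: int, c_out: int, k: int, bias: bool = False) -> int:
    """Number of weights in a standard k x k convolution."""
    return c_in * c_out * k * k + (c_out if bias else 0)


def dw_separable_params(c_in: int, c_out: int, k: int, bias: bool = False) -> int:
    """Number of weights in a depthwise k x k conv plus a pointwise 1 x 1 conv."""
    depthwise = c_in * k * k + (c_in if bias else 0)   # one k x k filter per input channel
    pointwise = c_in * c_out + (c_out if bias else 0)  # 1 x 1 conv that mixes channels
    return depthwise + pointwise


# Hypothetical layer size (assumption, not from the paper).
c_in, c_out, k = 128, 128, 3
standard = conv_params(c_in, c_out, k)
separable = dw_separable_params(c_in, c_out, k)
ratio = separable / standard  # equals 1/c_out + 1/k**2 when bias is disabled

print(f"standard:  {standard:,} weights")
print(f"separable: {separable:,} weights ({ratio:.1%} of standard)")
```

For a 3 x 3 kernel the ratio approaches 1/9 as the number of output channels grows, which is where most of the parameter saving in this kind of layer comes from.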
