Crop row detection enables autonomous robots to navigate in GPS-denied environments. Vision-based strategies often struggle in these environments because of gaps and curved crop rows, and they typically require post-processing steps. Furthermore, accurately labeling crop rows in under-canopy environments is very difficult due to occlusions. This study introduces RowDetr, an efficient end-to-end transformer-based neural network for crop row detection in precision agriculture. RowDetr leverages a lightweight backbone and a hybrid encoder to model straight, curved, or occluded crop rows with high precision. Central to the architecture is a novel polynomial representation that enables direct parameterization of crop rows, eliminating computationally expensive post-processing. Key innovations include a PolySampler module and multi-scale deformable attention, which work together with PolyOptLoss, an energy-based loss function designed to optimize geometric alignment between the predicted and annotated crop rows while also enhancing robustness against labeling noise. RowDetr was evaluated against other state-of-the-art end-to-end crop row detection methods, such as AgroNav and RolColAttention, on a diverse dataset of 6,962 high-resolution images with annotated crop rows, spanning multiple crop types and split into training, validation, and test sets. The system demonstrated superior performance, achieving an F1 score of up to 0.74 and a lane position deviation as low as 0.405. Furthermore, RowDetr achieved a real-time inference latency of 6.7 ms, which was reduced to 3.5 ms with INT8 quantization on an NVIDIA Jetson Orin AGX. This work highlights the efficiency gained from polynomial parameterization, making RowDetr particularly suitable for deployment on edge computing devices in agricultural robotics and autonomous farming equipment.
Index Terms: Crop Row Detection, Under Canopy Navigation, Transformers, RT-DETR, RT-DETRv2
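To make the idea of a polynomial crop row representation concrete, the following is a minimal illustrative sketch, not RowDetr's implementation. It assumes each row is described by coefficients of x as a polynomial in the normalized image height y; the abstract does not state the polynomial degree, the coordinate convention, or how the lane position deviation is computed, so the function names `sample_row` and `mean_lateral_deviation` and the simple deviation measure are hypothetical.

```python
import numpy as np


def sample_row(coeffs, num_points=5):
    """Sample (x, y) points on a row modeled as x = c0 + c1*y + c2*y^2 + ...

    Assumption for illustration: y is the normalized image height in [0, 1]
    and `coeffs` lists the polynomial coefficients in ascending order.
    """
    ys = np.linspace(0.0, 1.0, num_points)
    xs = np.polynomial.polynomial.polyval(ys, coeffs)
    return np.stack([xs, ys], axis=1)


def mean_lateral_deviation(pred_coeffs, gt_coeffs, num_points=50):
    """Mean absolute x-offset between two rows sampled at the same heights.

    This is a simple stand-in measure, not the paper's metric or PolyOptLoss.
    """
    pred = sample_row(pred_coeffs, num_points)
    gt = sample_row(gt_coeffs, num_points)
    return float(np.mean(np.abs(pred[:, 0] - gt[:, 0])))


# A gently curved predicted row compared with a straight annotated row.
predicted = [0.40, 0.10, 0.05]
annotated = [0.40, 0.10]
deviation = mean_lateral_deviation(predicted, annotated)
```

Because a row is fully described by a handful of coefficients, points along it can be obtained by direct evaluation, which is the sense in which a parametric output can avoid a separate curve-fitting or clustering step after the network.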