The problem of multi-object tracking (MOT) consists in detecting and tracking all the objects in a video sequence while keeping a unique identifier for each object. It is a challenging and fundamental problem for robotics. In precision agriculture, achieving a satisfactory solution is made harder by extreme camera motion, sudden illumination changes, and strong occlusions. Most modern trackers rely on object appearance rather than motion for association, which can be ineffective when most targets are static objects with the same appearance, as in the agricultural case. To address this, following SORT [5], we propose AgriSORT, a simple, online, real-time tracking-by-detection pipeline for precision agriculture that relies only on motion information and allows accurate and fast propagation of tracks between frames. AgriSORT focuses on efficiency, flexibility, minimal dependencies, and ease of deployment on robotic platforms. We test the proposed pipeline on a novel MOT benchmark specifically tailored to the agricultural context, based on video sequences taken in a table grape vineyard, which is particularly challenging due to the strong self-similarity and density of the instances. Both the code and the dataset are available for future comparisons.