Robots responsible for tasks over long time scales must be able to localize consistently and scalably amid geometric, viewpoint, and appearance changes. Existing visual SLAM approaches rely on low-level feature descriptors that are not robust to such environmental changes and produce large maps that scale poorly over long-term deployments. In contrast, object detections are robust to environmental variations and lead to more compact representations, but most object-based SLAM systems target short-term indoor deployments with close objects. In this paper, we introduce ObVi-SLAM to overcome these challenges by leveraging the best of both approaches. ObVi-SLAM uses low-level visual features for high-quality short-term visual odometry. To ensure global, long-term consistency, it builds an uncertainty-aware long-term map of persistent objects and updates it after every deployment. By evaluating ObVi-SLAM on data from 16 deployment sessions spanning different weather and lighting conditions, we empirically show that ObVi-SLAM generates accurate localization estimates that remain consistent over long time scales despite varying appearance conditions.