Active Robot Vision for Distant Object Change Detection: A Lightweight Training Simulator Inspired by Multi-Armed Bandits

In ground-view object change detection, recently emerging mapless navigation has great potential to navigate a robot to distantly detected objects (e.g., books, cups, clothes) and acquire high-resolution object images in order to identify their change states (no-change/appear/disappear). However, naively performing a full journey for every distant object incurs large sensing/planning/action costs, proportional to the number of objects and the robot-to-object distance. To address this issue, we explore a new map-based active vision problem in this work: "Which journey should the robot select next?" The feasibility of such an active vision framework remains unclear: because distant objects are recognized only with uncertainty, it is not known whether they provide sufficient cues for action planning. This work presents an efficient simulator for feasibility testing, intended to accelerate early-stage R&D cycles (e.g., prototyping, training, testing, and evaluation). The proposed simulator is designed to identify the degree of difficulty that a robot vision system (sensors/recognizers/planners/actuators) would face when applied to a given environment (workspace/objects). Notably, it requires only one real-world journey experience per distant object to function, making it suitable for an efficient R&D cycle. Another contribution of this work is a new lightweight planner inspired by the traditional multi-armed bandit problem. Specifically, we build a lightweight map-based planner on top of the mapless planner, and together they constitute a hierarchical action planner. We verified the effectiveness of the proposed framework using a semantically non-trivial scenario, "sofa as bookshelf".

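To make the bandit analogy in the abstract concrete, the following is a minimal sketch of how "Which journey should the robot select next?" could be framed as a multi-armed bandit. It is not the authors' planner: it uses the textbook UCB1 rule, treats each candidate journey toward a distant object as one arm, and assumes a hypothetical binary reward (1 when a step along the journey yields a useful observation). The names `ucb1_select`, `run_bandit`, `reward_probs`, and `budget` are illustrative assumptions, and the per-journey reward probabilities stand in for whatever signal a real recognizer would provide.

```python
import math
import random


def ucb1_select(counts, sums, t, c=2.0):
    """Return the index of the arm (journey) with the highest UCB1 score.

    counts[a] is how often arm a was chosen, sums[a] its accumulated reward,
    and t the current (1-based) decision step.
    """
    # Try every journey once before trusting any estimate.
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    # Mean reward plus an exploration bonus that shrinks with more visits.
    scores = [
        sums[a] / counts[a] + math.sqrt(c * math.log(t) / counts[a])
        for a in range(len(counts))
    ]
    return max(range(len(counts)), key=scores.__getitem__)


def run_bandit(reward_probs, budget, seed=0):
    """Simulate `budget` journey selections; return per-journey pull counts.

    reward_probs is a hypothetical stand-in for how informative each journey
    is; a real system would obtain rewards from its recognizer instead.
    """
    rng = random.Random(seed)
    k = len(reward_probs)
    counts = [0] * k
    sums = [0.0] * k
    for t in range(1, budget + 1):
        arm = ucb1_select(counts, sums, t)
        # Assumed Bernoulli reward: 1.0 if this step was informative.
        reward = 1.0 if rng.random() < reward_probs[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
    return counts


if __name__ == "__main__":
    # Three hypothetical distant objects with different informativeness.
    print(run_bandit([0.05, 0.9, 0.2], budget=500))
```

Under these assumptions the selection budget concentrates on the most informative journey while the others are still sampled occasionally, which is the exploration/exploitation trade-off the abstract alludes to.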