Sequentially interacting with articulated objects is crucial for a mobile manipulator to operate effectively in everyday environments. To enable long-horizon tasks involving articulated objects, this study explores building scene-level articulation models of indoor scenes through autonomous exploration. While previous research has studied mobile manipulation with articulated objects by considering object kinematic constraints, it primarily focuses on single-object scenarios and does not extend to the scene-level context required for task-level planning. To manipulate multiple object parts sequentially, the robot needs to reason about the resulting motion of each part and anticipate its impact on future actions. We introduce \ourtool{}, a full-stack approach for long-horizon manipulation tasks with articulated objects. The robot maps the scene, detects and physically interacts with articulated objects, collects observations, and infers their articulation properties. For sequential tasks, the robot plans a feasible series of object interactions based on the inferred articulation model. We demonstrate that our approach repeatably constructs accurate scene-level kinematic and geometric models, enabling long-horizon mobile manipulation in a real-world scene. Code and additional results are available at https://chengchunhsu.github.io/KinScene/
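The abstract states that the robot physically interacts with articulated objects, collects observations, and infers articulation properties from them. As a minimal illustrative sketch of that inference step (not the paper's actual method), one can estimate the axis of a revolute joint from the observed 3D trajectory of a point on the moving part: such points trace a circular arc, and the normal of the arc's best-fit plane gives the axis direction. All names below are hypothetical.

```python
import numpy as np

def fit_revolute_axis(points):
    """Estimate a revolute-joint axis direction from an (N, 3) array of
    trajectory points observed on the moving part.

    Points rotating about a fixed axis lie on a circle; after centering,
    the singular vector with the smallest singular value is the normal of
    the best-fit plane, i.e. the axis direction (up to sign)."""
    pts = np.asarray(points, dtype=float)
    centered = pts - pts.mean(axis=0)
    # SVD row space: the last right-singular vector spans the direction
    # of least variance, which is perpendicular to the circle's plane.
    _, _, vt = np.linalg.svd(centered)
    axis = vt[-1]
    return axis / np.linalg.norm(axis)

# Synthetic check: a handle point swinging 90 degrees about the z-axis.
angles = np.linspace(0.0, np.pi / 2, 20)
trajectory = np.stack(
    [np.cos(angles), np.sin(angles), np.zeros_like(angles)], axis=1
)
axis = fit_revolute_axis(trajectory)
print(np.abs(axis))  # recovered up to sign; here the z-axis
```

In practice observed trajectories are noisy, so a robust variant (e.g. RANSAC over the plane fit) and a separate estimate of the axis position would be needed; this sketch only recovers the axis direction under clean observations.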