Sequentially interacting with articulated objects is crucial for a mobile manipulator to operate effectively in everyday environments. To enable long-horizon tasks involving articulated objects, this study explores building scene-level articulation models for indoor scenes through autonomous exploration. While previous research has studied mobile manipulation with articulated objects by considering object kinematic constraints, it primarily focuses on single-object scenarios and does not extend to the scene-level context required for task-level planning. To manipulate multiple object parts sequentially, the robot must reason about the resultant motion of each part and anticipate its impact on future actions. We introduce KinScene, a full-stack approach for long-horizon manipulation tasks with articulated objects. The robot maps the scene, detects and physically interacts with articulated objects, collects observations, and infers their articulation properties. For sequential tasks, the robot plans a feasible series of object interactions based on the inferred articulation model. We demonstrate that our approach repeatably constructs accurate scene-level kinematic and geometric models, enabling long-horizon mobile manipulation in a real-world scene. Code and additional results are available at https://chengchunhsu.github.io/KinScene/