Accurately manipulating articulated objects is a challenging yet important task for real-world robot applications. In this paper, we present a novel framework called Sim2Real$^2$ that enables a robot to precisely manipulate an unseen articulated object to a desired state in the real world without human demonstrations. We leverage recent advances in physics simulation and learning-based perception to build an interactive, explicit physics model of the object and use it to plan a long-horizon manipulation trajectory that accomplishes the task. However, the interactive model cannot be correctly estimated from a static observation. Therefore, we learn to predict the object's affordance from a single-frame point cloud, control the robot to actively interact with the object using a one-step action, and capture another point cloud. The physics model is then constructed from the two point clouds. Experimental results show that, for common articulated objects, about 70% of manipulations reach the desired state with less than 30% relative error, while for difficult objects 30% of manipulations do. Our framework also enables advanced manipulation strategies, such as manipulating with different tools. Code and videos are available on our project webpage: https://ttimelord.github.io/Sim2Real2-site/
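To make the reported metric concrete, the following is a minimal sketch of how a relative-error success rate could be computed over a set of manipulation trials. The abstract does not define relative error, so the definition used here (final joint-state error divided by the requested joint motion) is an assumption, and the names `Trial`, `relative_error`, and `success_rate` are hypothetical and do not come from the Sim2Real$^2$ codebase.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Trial:
    """One manipulation attempt; joint states in radians or meters (hypothetical record)."""
    initial: float   # joint state before manipulation
    target: float    # desired joint state
    achieved: float  # joint state measured after manipulation


def relative_error(trial: Trial) -> float:
    """Assumed definition: |achieved - target| / |target - initial|."""
    requested = abs(trial.target - trial.initial)
    if requested == 0:
        raise ValueError("target equals initial state; relative error is undefined")
    return abs(trial.achieved - trial.target) / requested


def success_rate(trials: Sequence[Trial], threshold: float = 0.30) -> float:
    """Fraction of trials whose relative error is strictly below the threshold."""
    if not trials:
        raise ValueError("no trials given")
    hits = sum(1 for t in trials if relative_error(t) < threshold)
    return hits / len(trials)


# Illustrative values only; these are not results from the paper.
trials = [
    Trial(initial=0.0, target=1.0, achieved=0.9),
    Trial(initial=0.0, target=1.0, achieved=0.5),
    Trial(initial=0.2, target=0.6, achieved=0.55),
    Trial(initial=0.0, target=0.5, achieved=0.1),
]
print(success_rate(trials))
```

Normalizing by the requested motion, rather than by the absolute target value, keeps the metric comparable between small drawer displacements and large door rotations. Whether the authors normalize this way is not stated in the abstract.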