Robotic manipulation of deformable objects is a data-intensive regime in embodied learning: shape, contact, and topology co-evolve, producing variability that far exceeds that of rigid objects. Although simulation promises relief from the cost of real-world data acquisition, prevailing sim-to-real pipelines remain rooted in rigid-body abstractions, producing mismatched geometry, fragile soft-body dynamics, and motion primitives poorly suited to cloth interaction. We posit that simulation fails not because it is synthetic, but because it is ungrounded. To address this, we introduce SIM1, a physics-aligned real-to-sim-to-real data engine that grounds simulation in the physical world. Given a limited number of demonstrations, the system digitizes scenes into metric-consistent digital twins, calibrates deformable dynamics through elastic modeling, and expands behaviors via diffusion-based trajectory generation with quality filtering. This pipeline transforms sparse observations into scaled-up synthetic supervision with near-demonstration fidelity. Experiments show that policies trained on purely synthetic data achieve parity with real-data baselines at a 1:15 equivalence ratio, while delivering 90% zero-shot success and 50% generalization gains in real-world deployment. These results validate physics-aligned simulation as a scalable source of supervision for deformable manipulation and as a practical pathway to data-efficient policy learning.