Learning robust robot policies in real-world environments requires diverse data augmentation, yet scaling real-world data collection is costly because it involves acquiring physical assets and reconfiguring environments. Transferring real-world scenes into simulation has therefore become a practical form of augmentation for efficient learning and evaluation. We present a generative framework that establishes a real-to-sim mapping from real-world panoramas to high-fidelity simulation scenes and further synthesizes diverse cousin scenes via semantic and geometric editing. Combined with high-quality physics engines and realistic assets, the generated scenes support interactive manipulation tasks. Additionally, we incorporate multi-room stitching to construct consistent large-scale environments for long-horizon navigation across complex layouts. Experiments demonstrate a strong sim-to-real correlation, validating our platform's fidelity, and show that extensively scaling up data generation leads to significantly better generalization to unseen scene and object variations, underscoring the effectiveness of Digital Cousins for generalizable robot learning and evaluation.