Simulating object dynamics from real-world perception shows great promise for digital twins and robotic manipulation but often demands labor-intensive measurements and expertise. We present a fully automated Real2Sim pipeline that generates simulation-ready assets for real-world objects through robotic interaction. Using only a robot's joint torque sensors and an external camera, the pipeline identifies visual geometry, collision geometry, and physical properties such as inertial parameters. Our approach introduces a general method for extracting high-quality, object-centric meshes from photometric reconstruction techniques (e.g., NeRF, Gaussian Splatting) by employing alpha-transparent training while explicitly distinguishing foreground occlusions from background subtraction. We validate the full pipeline through extensive experiments, demonstrating its effectiveness across diverse objects. By eliminating the need for manual intervention or environment modifications, our pipeline can be integrated directly into existing pick-and-place setups, enabling scalable and efficient dataset creation. Project page (with code and data): https://scalable-real2sim.github.io/.
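The alpha-transparent training mentioned above can be illustrated with a minimal sketch: both the rendered prediction and the masked ground-truth image are composited over the same random background color, so the photometric loss only constrains object pixels and the reconstruction does not bake in the scene background. The function name, NumPy implementation, and array shapes below are illustrative assumptions, not the paper's actual code.

```python
import numpy as np

def alpha_composite_loss(pred_rgb, pred_alpha, gt_rgb, gt_mask, rng):
    """Hypothetical sketch of an alpha-transparent photometric loss.

    pred_rgb:   (N, 3) rendered colors from NeRF / Gaussian Splatting
    pred_alpha: (N,)   rendered opacities
    gt_rgb:     (N, 3) captured image colors
    gt_mask:    (N,)   object segmentation mask (1 = object, 0 = background)
    rng:        numpy Generator used to draw a random background color
    """
    # Draw one random background color shared by prediction and target.
    bg = rng.random(3)
    # Composite prediction over the random background using rendered alpha.
    pred = pred_alpha[..., None] * pred_rgb + (1.0 - pred_alpha[..., None]) * bg
    # Composite the ground truth over the same background using the mask,
    # so background pixels carry no gradient signal about the real scene.
    gt = gt_mask[..., None] * gt_rgb + (1.0 - gt_mask[..., None]) * bg
    return float(np.mean((pred - gt) ** 2))
```

Randomizing the background each step prevents the model from explaining background pixels with spurious geometry, which is what makes the extracted mesh object-centric.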