Robot learning increasingly relies on simulation to develop complex abilities such as dexterous manipulation and precise interaction, which requires high-quality digital assets to bridge the sim-to-real gap. However, existing open-source articulated-object datasets for simulation suffer from insufficient visual realism and low physical fidelity, which limits their usefulness for training models to perform robotic tasks in the real world. To address these challenges, we introduce ArtVIP, a comprehensive open-source dataset of high-quality digital-twin articulated objects, accompanied by indoor-scene assets. Crafted by professional 3D modelers adhering to unified standards, ArtVIP achieves visual realism through precise geometric meshes and high-resolution textures, and physical fidelity through fine-tuned dynamic parameters. The dataset also pioneers modular interaction behaviors embedded within assets, along with pixel-level affordance annotations. We use feature-map visualization and optical motion capture to quantitatively demonstrate ArtVIP's visual and physical fidelity, and we validate its applicability in imitation learning and reinforcement learning experiments. Provided in USD format with detailed production guidelines, ArtVIP is fully open-source, benefiting the research community and advancing robot learning research. Our project is at https://x-humanoid-artvip.github.io/ .