Scaling robot learning requires data collection pipelines that scale favorably with human effort. In this work, we propose Crowdsourcing and Amortizing Human Effort for Real-to-Sim-to-Real (CASHER), a pipeline for scaling up data collection and learning in simulation in which performance scales superlinearly with human effort. The key idea is to crowdsource digital twins of real-world scenes using 3D reconstruction and to collect large-scale data in simulation rather than in the real world. Data collection in simulation is initially driven by RL, bootstrapped with human demonstrations. As the training of a generalist policy progresses across environments, its generalization capabilities can be used to replace human effort with model-generated demonstrations. This results in a pipeline where behavioral data is collected in simulation with continually decreasing human effort. We show that CASHER exhibits zero-shot and few-shot scaling laws on three real-world tasks across diverse scenarios. We also show that CASHER enables fine-tuning of pre-trained policies to a target scenario using only a video scan, without any additional human effort. See our project website: https://casher-robot-learning.github.io/CASHER/
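The hand-off from human to model-generated demonstrations described above can be illustrated with a toy sketch. This is not the authors' implementation: every name here (`ToyGeneralist`, `collect_across_scenes`, `threshold`, `demos_per_scene`) is hypothetical, the rule that the policy's success rate grows linearly with the number of scenes seen is an invented stand-in for real generalization, and RL training is reduced to a counter. It only shows the control flow in which human effort is requested when the current generalist policy cannot yet produce demonstrations in a new scene.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ToyGeneralist:
    """Hypothetical stand-in for a generalist policy (assumption, not from the paper)."""
    scenes_seen: int = 0
    gain_per_scene: float = 0.15  # invented growth rate, for illustration only

    def success_rate(self, scene: str) -> float:
        # Toy assumption: zero-shot success in a new scene grows with scenes seen.
        return min(1.0, self.gain_per_scene * self.scenes_seen)

    def train_on(self, scene: str) -> None:
        # Placeholder for RL in the scene's digital twin plus distillation.
        self.scenes_seen += 1


@dataclass
class PipelineLog:
    human_demos: int = 0
    model_demos: int = 0
    sources: List[str] = field(default_factory=list)


def collect_across_scenes(
    scenes: List[str],
    policy: ToyGeneralist,
    demos_per_scene: int = 10,
    threshold: float = 0.5,
) -> PipelineLog:
    """Bootstrap each scene with model demos when possible, else human demos."""
    log = PipelineLog()
    for scene in scenes:
        if policy.success_rate(scene) >= threshold:
            # The generalist is good enough here: no human effort needed.
            log.model_demos += demos_per_scene
            log.sources.append("model")
        else:
            # Fall back to human demonstrations to bootstrap RL.
            log.human_demos += demos_per_scene
            log.sources.append("human")
        policy.train_on(scene)
    return log


if __name__ == "__main__":
    scene_names = [f"scene_{i}" for i in range(8)]
    result = collect_across_scenes(scene_names, ToyGeneralist())
    print(result.sources, result.human_demos, result.model_demos)
```

Under these toy assumptions, human demonstrations are requested only for the early scenes; once the simulated success rate crosses the threshold, every later scene is bootstrapped by the policy itself, which is the amortization effect the abstract describes.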