Model-free control methods such as reinforcement learning can learn control strategies without an accurate model or simulator of the world. While the lack of modeling requirements is appealing, such methods can be sample inefficient, making them impractical in many real-world domains. Model-based control techniques that leverage accurate simulators, on the other hand, can circumvent this challenge by using large amounts of cheap simulation data to learn controllers that transfer effectively to the real world. The difficulty with such model-based techniques is that they need an extremely accurate simulation, which requires specifying both appropriate simulation assets and physical parameters, and doing so for every environment under consideration takes considerable human effort. In this work, we propose a learning system that leverages a small amount of real-world data to autonomously refine a simulation model and then plan an accurate control strategy that can be deployed in the real world. Our approach critically relies on using an initial (possibly inaccurate) simulator to design effective exploration policies that, when deployed in the real world, collect high-quality data. We demonstrate the efficacy of this paradigm in identifying articulation, mass, and other physical parameters in several challenging robotic manipulation tasks, and show that only a small amount of real-world data is needed for effective sim-to-real transfer. Project website: https://weirdlabuw.github.io/asid