Model-free control methods such as reinforcement learning can learn control strategies without an accurate model or simulator of the world. Although the absence of modeling requirements is appealing, such methods can be sample inefficient, which makes them impractical in many real-world domains. Model-based control techniques that leverage accurate simulators can circumvent this challenge, using large amounts of cheap simulation data to learn controllers that transfer effectively to the real world. The difficulty with such model-based techniques is that they need an extremely accurate simulation, which in turn requires specifying both appropriate simulation assets and physical parameters. Designing these for every environment under consideration takes considerable human effort. In this work, we propose a learning system that leverages a small amount of real-world data to autonomously refine a simulation model and then plan an accurate control strategy that can be deployed in the real world. Our approach critically relies on using an initial (possibly inaccurate) simulator to design effective exploration policies that, when deployed in the real world, collect high-quality data. We demonstrate the efficacy of this paradigm in identifying articulation, mass, and other physical parameters in several challenging robotic manipulation tasks, and show that only a small amount of real-world data is needed for effective sim-to-real transfer. Project website: https://weirdlabuw.github.io/asid
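The explore, refine, plan loop described in the abstract can be illustrated with a toy sketch. Everything here is an assumption made for illustration and is not taken from the paper: the 1-D pushed-block `simulate` function, the finite-difference `sensitivity` score used to rank exploration actions, the grid-search fit, and all constants. The "real world" is stood in for by the same toy dynamics run with a hidden mass plus measurement noise.

```python
import random

# Toy constants (hypothetical, chosen only for this illustration).
DT, STEPS, G, MU = 0.05, 40, 9.81, 0.2

def simulate(mass, force):
    """Toy 1-D simulator: positions of a block under a constant push with friction."""
    x, v, traj = 0.0, 0.0, []
    for _ in range(STEPS):
        a = max(0.0, force / mass - MU * G)  # block does not move if the push is too weak
        v += a * DT
        x += v * DT
        traj.append(x)
    return traj

def sensitivity(mass, force, eps=1e-3):
    """Squared finite-difference sensitivity of the trajectory to the mass parameter."""
    hi, lo = simulate(mass + eps, force), simulate(mass - eps, force)
    return sum(((h - l) / (2 * eps)) ** 2 for h, l in zip(hi, lo))

def choose_exploration_force(mass_guess, candidates):
    """Explore: pick the push that the (possibly inaccurate) simulator rates most informative."""
    return max(candidates, key=lambda f: sensitivity(mass_guess, f))

def fit_mass(real_traj, force, grid):
    """Refine: pick the mass whose simulated trajectory best matches the observed one."""
    def loss(m):
        return sum((s - r) ** 2 for s, r in zip(simulate(m, force), real_traj))
    return min(grid, key=loss)

def plan_force(mass, target, candidates):
    """Plan: pick the push whose simulated final position is closest to the target."""
    return min(candidates, key=lambda f: abs(simulate(mass, f)[-1] - target))

rng = random.Random(0)
TRUE_MASS = 2.0   # hidden "real-world" value; a stand-in for the physical system
mass_guess = 1.0  # initial, inaccurate simulator parameter

# 1) Design the exploration action in the inaccurate simulator.
force = choose_exploration_force(mass_guess, [1.0, 3.0, 6.0, 10.0])

# 2) Collect one noisy "real" trajectory and refine the simulator's mass.
real_traj = [x + rng.gauss(0.0, 0.002) for x in simulate(TRUE_MASS, force)]
mass_hat = fit_mass(real_traj, force, [0.5 + 0.05 * i for i in range(71)])

# 3) Plan a push in the refined simulator to move the block about 3 m.
planned_force = plan_force(mass_hat, 3.0, [4.0 + 0.5 * i for i in range(17)])

print("exploration force:", force)
print("estimated mass:", round(mass_hat, 2))
print("planned force:", planned_force)
```

In this toy setting a weak push leaves the block stationary, so its trajectory carries no information about the mass. Ranking candidate actions by sensitivity in the initial simulator avoids spending the single real-world rollout on such an action.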