We present an integrated (end-to-end) framework for the Real2Sim2Real problem of manipulating deformable linear objects (DLOs) based on visual perception. Working with a parameterised set of DLOs, we use likelihood-free inference (LFI) to compute posterior distributions over the physical parameters with which we can approximately simulate the behaviour of each specific DLO. We use these posteriors for domain randomisation while training, in simulation, object-specific visuomotor policies (i.e. assuming only visual and proprioceptive sensing) for a DLO reaching task, using model-free reinforcement learning. We demonstrate the utility of this approach by deploying the sim-trained DLO manipulation policies in the real world in a zero-shot manner, i.e. without any further fine-tuning. In this context, we evaluate the capacity of a prominent LFI method to perform fine-grained classification over the parametric set of DLOs, using only visual and proprioceptive data obtained along a dynamic manipulation trajectory. We then study the implications of the resulting domain distributions for sim-based policy learning and real-world performance.
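The central pipeline step, using an LFI posterior over DLO physical parameters as the sampling distribution for domain randomisation, can be sketched minimally as follows. All names, parameter choices, and the Gaussian posterior here are illustrative assumptions for the sketch, not the paper's actual estimator (which would typically be a neural posterior fitted from visual and proprioceptive trajectory data):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical LFI output: samples from an approximate posterior over
# two DLO physics parameters (e.g. bending stiffness, damping).
# In practice these would come from a neural posterior estimator
# conditioned on observed manipulation trajectories.
posterior_samples = rng.normal(loc=[0.5, 0.1],
                               scale=[0.05, 0.02],
                               size=(1000, 2))

def sample_sim_params(n_episodes: int) -> np.ndarray:
    """Domain randomisation: draw one physics configuration per
    simulated training episode by resampling the LFI posterior,
    so the policy is trained across the plausible parameter range
    rather than a single point estimate."""
    idx = rng.integers(0, len(posterior_samples), size=n_episodes)
    return posterior_samples[idx]

# One row of physics parameters per simulated episode.
params = sample_sim_params(8)
```

Each training episode would then instantiate the simulator with one sampled row, so the learned policy is implicitly robust to the residual parameter uncertainty that remains after inference.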