Reinforcement learning (RL) has shown great promise for efficiently learning force control policies in peg-in-hole tasks. However, robots often face difficulties due to visual occlusions by the gripper and uncertainties in the initial grasping pose of the peg. These challenges tend to restrict force-controlled insertion policies to situations where the peg is rigidly fixed to the end-effector. While vision-based tactile sensors offer rich tactile feedback that could potentially address these issues, using them to learn effective tactile policies is computationally intensive, and the resulting policies are difficult to generalize. In this paper, we propose a robust tactile insertion policy that can align a tilted peg with the hole using active inference, without the need for extensive training on large datasets. Our approach employs a dual-policy architecture: one policy focuses on insertion, integrating force control and RL to guide the object into the hole, while the other policy performs active inference based on tactile feedback to align the tilted peg with the hole. In real-world experiments, our dual-policy architecture achieved a 90% success rate when inserting a peg into a hole with a clearance of less than 0.1 mm, significantly outperforming previous methods that lack tactile sensory feedback (5% success rate). To assess the generalizability of our alignment policy, we conducted experiments with five different pegs, demonstrating its effective adaptation to multiple objects.
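The dual-policy idea can be illustrated with a minimal sketch. The abstract does not specify how the two policies are coordinated, so the threshold-based switch below is an assumption, and every name in it (`Observation`, `select_policy`, `align_step`, `insert_step`, `run_episode`, and all numeric parameters) is hypothetical. The two step functions are simple stand-ins: `align_step` is a proportional tilt correction, not active inference, and `insert_step` is a fixed advance, not a learned RL force controller.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    tilt_deg: float   # hypothetical peg tilt estimate derived from tactile feedback
    depth_mm: float   # hypothetical insertion depth


def select_policy(obs: Observation, tilt_threshold_deg: float = 1.0) -> str:
    """Assumed switching rule: align while tilt exceeds the threshold, else insert."""
    return "align" if abs(obs.tilt_deg) > tilt_threshold_deg else "insert"


def align_step(obs: Observation, gain: float = 0.5) -> Observation:
    # Stand-in for the alignment policy: shrink the tilt proportionally.
    return Observation(obs.tilt_deg * (1.0 - gain), obs.depth_mm)


def insert_step(obs: Observation, step_mm: float = 1.0) -> Observation:
    # Stand-in for the insertion policy: advance the peg by a fixed step.
    return Observation(obs.tilt_deg, obs.depth_mm + step_mm)


def run_episode(obs: Observation, target_depth_mm: float = 10.0, max_steps: int = 100):
    """Alternate between the two stand-in policies until the target depth is reached."""
    trace = []
    for _ in range(max_steps):
        if obs.depth_mm >= target_depth_mm:
            break
        mode = select_policy(obs)
        trace.append(mode)
        obs = align_step(obs) if mode == "align" else insert_step(obs)
    return obs, trace


if __name__ == "__main__":
    final_obs, trace = run_episode(Observation(tilt_deg=4.0, depth_mm=0.0))
    print(trace)
    print(final_obs)
```

In this toy setup the episode begins with alignment steps and hands over to insertion once the tilt falls to the assumed threshold; a real system would replace both stand-ins with the policies described in the paper.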