Tendon Force Modeling for Sim2Real Transfer of Reinforcement Learning Policies for Tendon-Driven Robots

Robots that rely on soft or compliant interactions often use tendon-driven actuation, which allows actuators to be placed more flexibly and compliance to be maintained. However, controlling complex tendon systems is challenging. Simulation paired with reinforcement learning (RL) could enable more complex behaviors to be generated. Such methods rely on torque- and force-based simulation rollouts, which are limited by the sim-to-real gap stemming from the actuator and system dynamics, resulting in poor transfer of RL policies onto real robots. To address this, we propose a method to model the tendon forces produced by typical servo motors, focusing specifically on the transfer of RL policies for a tendon-driven finger. Our approach extends existing data-driven techniques by leveraging contextual history and a novel data collection test-bench. This test-bench allows us to capture tendon forces during the contact-rich interactions typical of real-world manipulation. We then use our force estimation model in a GPU-accelerated, tendon force-driven rigid body simulation to train RL-based controllers. Our transformer-based model predicts tendon forces to within 3% of the maximum motor force and is robot-agnostic. By integrating our learned model into simulation, we reduce the sim-to-real gap for test trajectories by 41%. An RL-based controller trained with our model achieves a 50% improvement in fingertip pose tracking tasks on real tendon-driven robotic fingers. This approach generalizes to different actuators and robot systems and can enable RL policies to be used widely across tendon systems, advancing the capabilities of dexterous manipulators and soft robots.
