In-hand object manipulation is a fundamental yet challenging capability for dexterous robots. Despite significant progress in dexterous manipulation, existing approaches rely heavily on vision or tactile sensing to track object states, while joint sensing, the most readily available modality on any robotic hand, remains largely overlooked, particularly for tendon-driven hands. In this paper, we study how far joint sensing alone can go by asking: (i) whether motor encoders or direct joint sensing provides better proprioceptive feedback, (ii) how to extract environment information from joint measurements, and (iii) whether joint-only control can achieve competitive real-world performance without external perception. We present the Proprioceptive Transformer (PT), an exteroception-free approach for continuous cube rotation on a tendon-driven dexterous hand that uses only joint sensing feedback. A teacher policy is first trained via reinforcement learning with privileged object information and then distilled into PT, which operates solely on joint position and velocity histories. The Transformer architecture effectively extracts implicit object state information from temporal patterns in joint sensor readings. Experiments on the real ORCA hand show that our approach achieves 3.1x higher rotation speed than baselines. PT also achieves a 23.4% lower RMSE for cube position estimation than the MLP baseline, indicating that it extracts exteroceptive information from proprioceptive signals more effectively.
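To make the phrase "joint position and velocity histories" concrete, the following is a minimal sketch of how such a history could be assembled into a fixed-length token sequence for a sequence model. It is an illustration under stated assumptions, not the authors' implementation: the class name `JointHistory`, the window length `horizon`, the per-step layout (positions followed by velocities), and the zero-padding of a partially filled window are all hypothetical choices that the abstract does not specify.

```python
from collections import deque
from typing import Deque, List, Sequence


class JointHistory:
    """Rolling window of joint positions and velocities (hypothetical helper)."""

    def __init__(self, num_joints: int, horizon: int) -> None:
        if num_joints <= 0 or horizon <= 0:
            raise ValueError("num_joints and horizon must be positive")
        self.num_joints = num_joints
        self.horizon = horizon
        # deque(maxlen=...) silently drops the oldest step once the window is full.
        self._steps: Deque[List[float]] = deque(maxlen=horizon)

    def push(self, positions: Sequence[float], velocities: Sequence[float]) -> None:
        """Append one time step: joint positions followed by joint velocities."""
        if len(positions) != self.num_joints or len(velocities) != self.num_joints:
            raise ValueError("expected one position and one velocity per joint")
        self._steps.append(
            [float(p) for p in positions] + [float(v) for v in velocities]
        )

    def tokens(self) -> List[List[float]]:
        """Return `horizon` rows of length 2 * num_joints, oldest first.

        Assumption: missing early steps are zero-padded at the front.
        """
        width = 2 * self.num_joints
        missing = self.horizon - len(self._steps)
        padding = [[0.0] * width for _ in range(missing)]
        return padding + [list(step) for step in self._steps]


if __name__ == "__main__":
    history = JointHistory(num_joints=2, horizon=3)
    history.push([0.10, 0.20], [0.0, 0.0])
    history.push([0.15, 0.25], [0.5, 0.5])
    for row in history.tokens():
        print(row)
```

Each row returned by `tokens()` would correspond to one time step that a student policy could treat as a single input token. Because the observation contains only joint readings, any object state information has to be inferred from how these rows evolve over the window.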