Sensing contact pressure applied by a gripper is useful for autonomous and teleoperated robotic manipulation, but adding tactile sensing to a gripper's surface can be difficult or impractical. If a gripper visibly deforms when forces are applied, contact pressure can be visually estimated using images from an external camera that observes the gripper. While researchers have demonstrated this capability in controlled laboratory settings, prior work has not addressed challenges associated with visual pressure estimation in the wild, where lighting, surfaces, and other factors vary widely. We present a deep learning model and associated methods that enable visual pressure estimation under widely varying conditions. Our model, Visual Pressure Estimation for Robots (ViPER), takes an image from an eye-in-hand camera as input and outputs an image representing the pressure applied by a soft gripper. Our key insight is that force/torque sensing can be used as a weak label to efficiently collect training data in settings where pressure measurements would be difficult to obtain. When trained on this weakly labeled data combined with fully labeled data containing pressure measurements, ViPER outperforms prior methods, enables precision manipulation in cluttered settings, and provides accurate estimates for unseen conditions relevant to in-home use.
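To make the weak-label idea concrete, the following is a minimal sketch of how a force/torque reading might supervise a predicted pressure image when no pressure measurement is available. It is not ViPER's actual loss, which the abstract does not specify. The function names, the `pixel_area` and `weak_weight` parameters, and the assumption that pressure summed over pixels times a constant per-pixel area approximates the measured normal force are all hypothetical choices made for illustration.

```python
from typing import List, Optional

# Per-pixel pressure estimates; the unit (e.g. Pa) is an assumption.
PressureMap = List[List[float]]


def full_label_loss(pred: PressureMap, target: PressureMap) -> float:
    """Mean squared per-pixel error against a measured pressure map."""
    diffs = [
        (p - t) ** 2
        for pred_row, target_row in zip(pred, target)
        for p, t in zip(pred_row, target_row)
    ]
    return sum(diffs) / len(diffs)


def weak_label_loss(pred: PressureMap, normal_force: float, pixel_area: float) -> float:
    """Squared error between integrated predicted pressure and a force reading.

    Assumption: every pixel covers the same contact area, so
    sum(pressure) * pixel_area approximates the total normal force.
    """
    predicted_force = sum(p for row in pred for p in row) * pixel_area
    return (predicted_force - normal_force) ** 2


def combined_loss(
    pred: PressureMap,
    target: Optional[PressureMap] = None,
    normal_force: Optional[float] = None,
    pixel_area: float = 1.0,
    weak_weight: float = 1.0,
) -> float:
    """Add whichever loss terms the available labels allow for one sample."""
    loss = 0.0
    if target is not None:  # fully labeled sample with a pressure map
        loss += full_label_loss(pred, target)
    if normal_force is not None:  # weakly labeled sample with a force reading
        loss += weak_weight * weak_label_loss(pred, normal_force, pixel_area)
    return loss
```

In this sketch, a sample collected in a home setting would supply only `normal_force`, while a laboratory sample could supply `target` as well, so both kinds of data can contribute to one training objective. A real implementation would also need to account for perspective, since the contact area covered by one pixel is not constant in an eye-in-hand view.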