GelFlow: Self-supervised Learning of Optical Flow for Vision-Based Tactile Sensor Displacement Measurement

High-resolution, multi-modal information acquired by vision-based tactile sensors can support more dexterous manipulation by robot fingers. Optical flow is low-level information obtained directly by vision-based tactile sensors, and it can be transformed into other modalities such as force, geometry, and depth. Current vision-based tactile sensors employ optical flow methods from OpenCV to estimate the deformation of markers in gels. However, these methods are not precise enough to measure marker displacement accurately during large elastic deformation of the gel, which can significantly degrade the accuracy of downstream tasks. This study proposes a self-supervised, deep learning-based optical flow method to achieve high-accuracy displacement measurement for vision-based tactile sensors. The proposed method employs a coarse-to-fine strategy to handle large deformations by constructing a multi-scale feature pyramid from the input image. To better handle the elastic deformation of the gel, a Helmholtz velocity decomposition constraint and an elastic deformation constraint are adopted to address the distortion rate and the area change rate, respectively. A local flow fusion module is designed to smooth the optical flow, taking into account prior knowledge of the blurring effect of gel deformation. We trained the proposed self-supervised network on an open-source dataset and compared it with traditional and deep learning-based optical flow methods. The results show that the proposed method achieved the highest displacement measurement accuracy, demonstrating its potential to enable more precise measurements in downstream tasks that use vision-based tactile sensors.
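To make the quantities mentioned above more concrete, the following is a minimal sketch of how first-order deformation terms can be read off a dense flow field with finite differences. It is not the paper's loss formulation: the function name `flow_gradient_terms`, the `(H, W, 2)` array layout, and the synthetic test fields are all assumptions made for illustration. For small deformations, the divergence of the flow approximates the local relative area change, the curl is twice the local rotation angle, and the remaining two terms describe shear and stretch (distortion).

```python
import numpy as np

def flow_gradient_terms(flow):
    """Return divergence, curl, shear, and stretch of a dense 2D flow field.

    flow: array of shape (H, W, 2); flow[..., 0] is the x displacement u and
    flow[..., 1] is the y displacement v, both in pixels (assumed layout).
    """
    u, v = flow[..., 0], flow[..., 1]
    # np.gradient returns derivatives along axis 0 (y) first, then axis 1 (x).
    du_dy, du_dx = np.gradient(u)
    dv_dy, dv_dx = np.gradient(v)
    divergence = du_dx + dv_dy  # first-order relative area change
    curl = dv_dx - du_dy        # twice the local rotation angle (small deformation)
    shear = du_dy + dv_dx       # shear deformation
    stretch = du_dx - dv_dy     # anisotropic stretch
    return divergence, curl, shear, stretch

# Synthetic fields used only to sanity-check the sketch.
h, w = 32, 32
ys, xs = np.mgrid[0:h, 0:w].astype(float)
expansion = np.stack([0.01 * xs, 0.01 * ys], axis=-1)   # uniform area growth
rotation = np.stack([-0.02 * ys, 0.02 * xs], axis=-1)   # small rigid rotation

div_e, curl_e, shear_e, stretch_e = flow_gradient_terms(expansion)
div_r, curl_r, shear_r, stretch_r = flow_gradient_terms(rotation)
```

In this sketch the expansion field has constant divergence and zero curl, while the rotation field has zero divergence and constant curl. A regularizer built on such terms could penalize area change and distortion separately, although how the proposed method weights or combines them is not specified here.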
