Accurate 3D tracking of hand and finger movements poses significant challenges in computer vision. Potential applications span multiple domains, including human-computer interaction, virtual reality, industry, and medicine. While gesture recognition has achieved remarkable accuracy, quantifying fine movements remains a hurdle, particularly in clinical applications, where assessing hand dysfunctions and rehabilitation training outcomes requires precise measurements. Several novel, lightweight frameworks based on deep learning have emerged to address this issue; however, their accuracy and reliability in measuring finger movements must be validated against well-established gold-standard systems. This paper validates the hand-tracking framework implemented by Google MediaPipe Hand (GMH) and an innovative enhanced version, GMH-D, which exploits the depth estimation of an RGB-Depth camera to track 3D movements more accurately. Three dynamic exercises commonly administered by clinicians to assess hand dysfunctions are considered: Hand Opening-Closing, Single Finger Tapping, and Multiple Finger Tapping. Results demonstrate high temporal and spectral consistency of both frameworks with the gold standard. However, the enhanced GMH-D framework exhibits superior accuracy in spatial measurements compared to the baseline GMH, for both slow and fast movements. Overall, our study contributes to the advancement of hand-tracking technology, establishes a validation procedure as good practice for proving the efficacy of deep-learning-based hand tracking, and demonstrates the effectiveness of GMH-D as a reliable framework for assessing 3D hand movements in clinical applications.
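The abstract states that GMH-D uses the depth estimate of an RGB-Depth camera to improve 3D tracking, but it does not specify how landmarks and depth are combined. As an illustration only, the sketch below shows one common way to lift a 2D landmark to 3D: back-projecting the pixel through a pinhole camera model using the depth measured at that pixel. The function names `deproject` and `fingertip_distance`, the intrinsics (`fx`, `fy`, `cx`, `cy`), and the pixel and depth values are hypothetical and are not taken from the paper.

```python
import numpy as np


def deproject(u, v, depth_m, fx, fy, cx, cy):
    """Back-project pixel (u, v) with depth in metres to 3D camera coordinates.

    Assumes an ideal pinhole camera without lens distortion; fx, fy are focal
    lengths in pixels and (cx, cy) is the principal point.
    """
    x = (u - cx) * depth_m / fx
    y = (v - cy) * depth_m / fy
    return np.array([x, y, depth_m], dtype=float)


def fingertip_distance(p_a, p_b):
    """Euclidean distance in metres between two 3D points."""
    return float(np.linalg.norm(p_a - p_b))


# Hypothetical intrinsics and landmark pixels, for illustration only.
fx = fy = 600.0
cx, cy = 320.0, 240.0
thumb_tip = deproject(320.0, 240.0, 0.5, fx, fy, cx, cy)
index_tip = deproject(380.0, 240.0, 0.5, fx, fy, cx, cy)
gap_m = fingertip_distance(thumb_tip, index_tip)
```

A thumb-to-index distance of this kind, sampled over time, is the sort of signal on which a Finger Tapping exercise could be compared against a gold-standard system; the actual measures used in the study are not given in the abstract.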