Gesture recognition is an indispensable component of natural and efficient human-computer interaction, particularly in desktop-level applications, where it can significantly improve productivity. However, the gesture recognition community currently lacks a suitable desktop-level (top-view perspective) dataset for lightweight gesture capture devices. In this study, we establish a dataset named GR4DHCI, distinguished by its naturalness, intuitiveness, and diversity, and intended as a resource for developing desktop-level portable applications. GR4DHCI comprises over 7,000 gesture samples and a total of 382,447 frames for both Stereo IR and skeletal modalities. To account for variation in hand positioning during desktop interaction, the dataset covers 27 different hand positions. Building on GR4DHCI, we conducted a series of experiments, whose results demonstrate that the fine-grained classification blocks proposed in this paper can improve a model's recognition accuracy. We anticipate that the dataset and experimental findings will advance research on desktop-level gesture recognition.