Hand gesture recognition (HGR) is a fundamental technology in human-computer interaction (HCI). In particular, HGR based on Doppler radar signals is well suited to in-vehicle interfaces and robotic systems, which require lightweight and computationally efficient recognition techniques. However, conventional deep learning-based methods still suffer from high computational costs. To address this issue, we propose an Echo State Network (ESN) approach for radar-based HGR using frequency-modulated continuous-wave (FMCW) radar signals. Raw radar data are first converted into feature maps, such as range-time and Doppler-time maps, which are then fed into one or more recurrent neural network-based reservoirs. The resulting reservoir states are processed by readout classifiers, including ridge regression, support vector machines, and random forests. Comparative experiments demonstrate that our method outperforms existing approaches on an 11-class HGR task using the Soli dataset and surpasses existing deep learning models on a 4-class HGR task using the Dop-NET dataset. The results indicate that parallel processing with multi-reservoir ESNs is effective for recognizing temporal patterns across multiple feature maps in the time-space and time-frequency domains. Our ESN approaches achieve high recognition performance at low computational cost in HGR, showing great potential for more advanced HCI technologies, especially in resource-constrained environments.
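The pipeline described above (feature maps, parallel reservoirs, readout) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not specify the reservoir update rule, reservoir sizes, or how states are summarized, so the leaky-integrator tanh update, the time-averaged state, and every function name and default value below are assumptions made for illustration. Only the ridge-regression readout is shown; the support vector machine and random forest readouts are omitted.

```python
import numpy as np


def make_reservoir(n_inputs, n_units, spectral_radius=0.9, input_scale=0.5, seed=0):
    """Create fixed random input and recurrent weights (hypothetical defaults)."""
    rng = np.random.default_rng(seed)
    w_in = rng.uniform(-input_scale, input_scale, size=(n_units, n_inputs))
    w = rng.uniform(-1.0, 1.0, size=(n_units, n_units))
    # Rescale so the largest absolute eigenvalue equals spectral_radius.
    w *= spectral_radius / np.max(np.abs(np.linalg.eigvals(w)))
    return w_in, w


def reservoir_features(feature_map, w_in, w, leak=0.3):
    """Drive a leaky-tanh reservoir with a (time, features) map.

    Returns the time-averaged reservoir state (an assumed summary choice).
    """
    x = np.zeros(w.shape[0])
    total = np.zeros_like(x)
    for u in feature_map:
        x = (1.0 - leak) * x + leak * np.tanh(w_in @ u + w @ x)
        total += x
    return total / len(feature_map)


def multi_reservoir_features(maps, reservoirs):
    """Use one reservoir per feature map and concatenate the results."""
    return np.concatenate(
        [reservoir_features(m, w_in, w) for m, (w_in, w) in zip(maps, reservoirs)]
    )


def fit_ridge(states, labels, n_classes, alpha=1e-2):
    """Closed-form ridge regression onto one-hot class targets."""
    targets = np.eye(n_classes)[labels]
    gram = states.T @ states + alpha * np.eye(states.shape[1])
    return np.linalg.solve(gram, states.T @ targets)


def predict(states, w_out):
    """Pick the class with the largest readout output."""
    return np.argmax(states @ w_out, axis=1)
```

In this sketch, a gesture sample would be a pair such as a range-time map and a Doppler-time map, each stored as a `(time, features)` array. Only `fit_ridge` involves training; the reservoir weights stay fixed after random initialization, which is the usual source of the low training cost associated with ESNs.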