This study focuses on Hand Gesture Recognition (HGR), which is vital for perceptual computing across various real-world contexts. The primary challenge in the HGR domain lies in handling the individual variations inherent in human hand morphology. To tackle this challenge, we introduce an innovative HGR framework that combines data-level fusion with an Ensemble Tuner Multi-stream CNN architecture. This approach encodes spatiotemporal gesture information from the skeleton modality into RGB images, minimizing noise while improving semantic gesture comprehension. Our framework operates in real time, significantly reducing hardware requirements and computational complexity while maintaining competitive performance on benchmark datasets such as SHREC2017, DHG1428, FPHA, LMDHG, and CNR. This robustness paves the way for practical, real-time applications of human-machine interaction and ambient intelligence on resource-limited devices.