Human-Computer Interaction (HCI) has been a research subject for many years, and recent studies have focused on improving the performance of HCI systems through various techniques. Over the past decade, deep learning has achieved high performance in many research areas, which has led researchers to explore its application to HCI. Convolutional neural networks with deep architectures can recognize hand gestures from images. In this study, we evaluated pre-trained, high-performance deep architectures on the HG14 dataset, which consists of 14 hand gesture classes. Among the 22 models evaluated, versions of VGGNet and MobileNet attained the highest accuracy. Specifically, VGG16 and VGG19 achieved accuracies of 94.64% and 94.36%, respectively, while MobileNet and MobileNetV2 achieved 96.79% and 94.43%, respectively. We then performed hand gesture recognition on the dataset with an ensemble learning technique that combined these four most successful models. Using them as base learners and applying the Dirichlet ensemble technique, we achieved an accuracy of 98.88%. These results demonstrate the effectiveness of deep ensemble learning for HCI and its potential for applications such as augmented reality, virtual reality, and game technologies.
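The abstract does not spell out how the Dirichlet ensemble weights its base learners, so the following is only a minimal sketch of one common formulation: candidate weight vectors are drawn from a symmetric Dirichlet distribution, each is used to average the base learners' class probabilities, and the vector with the best validation accuracy is kept. The function names, the `alpha` and `n_trials` parameters, and the nested-list layout of `prob_sets` are illustrative assumptions and are not taken from the study.

```python
import random


def sample_dirichlet(k, alpha=1.0, rng=random):
    """Draw one weight vector from a symmetric Dirichlet(alpha) over k learners.

    Normalizing independent Gamma(alpha, 1) draws yields a Dirichlet sample.
    """
    draws = [rng.gammavariate(alpha, 1.0) for _ in range(k)]
    total = sum(draws)
    return [d / total for d in draws]


def weighted_average(prob_sets, weights):
    """Blend class probabilities; prob_sets[m][i][c] is model m, sample i, class c."""
    n_samples = len(prob_sets[0])
    n_classes = len(prob_sets[0][0])
    return [
        [
            sum(w * prob_sets[m][i][c] for m, w in enumerate(weights))
            for c in range(n_classes)
        ]
        for i in range(n_samples)
    ]


def accuracy(probs, labels):
    """Fraction of samples whose highest-probability class matches the label."""
    hits = 0
    for row, label in zip(probs, labels):
        predicted = max(range(len(row)), key=lambda c: row[c])
        hits += predicted == label
    return hits / len(labels)


def fit_dirichlet_ensemble(prob_sets, labels, n_trials=1000, alpha=1.0, seed=0):
    """Random search over Dirichlet-sampled weights (an assumed procedure).

    Starts from uniform weights and keeps whichever sampled weight vector
    gives the highest accuracy on the supplied validation predictions.
    """
    rng = random.Random(seed)
    k = len(prob_sets)
    best_weights = [1.0 / k] * k
    best_acc = accuracy(weighted_average(prob_sets, best_weights), labels)
    for _ in range(n_trials):
        weights = sample_dirichlet(k, alpha, rng)
        acc = accuracy(weighted_average(prob_sets, weights), labels)
        if acc > best_acc:
            best_weights, best_acc = weights, acc
    return best_weights, best_acc
```

In the setting described above, `prob_sets` would hold the validation-set softmax outputs of the four base learners (VGG16, VGG19, MobileNet, and MobileNetV2), and the returned weights would then be applied to their test-set outputs. Selecting weights on a held-out validation split rather than on the test set keeps the reported accuracy from being optimistically biased.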