Hand gesture recognition is becoming a more prevalent mode of human-computer interaction, especially as cameras proliferate across everyday devices. Despite continued progress in this field, gesture customization remains underexplored. Customization is crucial because it enables users to define and demonstrate gestures that are more natural, memorable, and accessible. However, customization requires efficient use of user-provided data. We introduce a method that enables users to easily design bespoke gestures with a monocular camera from a single demonstration. We employ transformers and meta-learning techniques to address few-shot learning challenges. Unlike prior work, our method supports any combination of one-handed, two-handed, static, and dynamic gestures, generalizes across viewpoints, and handles irrelevant hand movements. We implement three real-world applications using our customization method, conduct a user study, and achieve up to 94% average recognition accuracy from one demonstration. Our work provides a viable path for vision-based gesture customization, laying the foundation for future advancements in this domain.
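To make the one-shot setting concrete, the sketch below shows a minimal nearest-prototype classifier over hand-landmark sequences: one demonstration per custom gesture, with a similarity threshold that rejects irrelevant hand movement. This is an illustrative stand-in only — the function names are hypothetical, the mean-pooling "embedding" replaces the paper's learned transformer encoder, and no meta-learning is performed.

```python
import math

def embed(landmark_frames):
    # Toy embedding: mean-pool per-frame landmark coordinates into one
    # unit-norm vector. (A stand-in for a learned transformer encoder.)
    dim = len(landmark_frames[0])
    pooled = [sum(f[i] for f in landmark_frames) / len(landmark_frames)
              for i in range(dim)]
    norm = math.sqrt(sum(x * x for x in pooled)) or 1.0
    return [x / norm for x in pooled]

class OneShotGestureClassifier:
    """Nearest-prototype classifier: one demonstration per custom gesture."""

    def __init__(self):
        self.prototypes = {}  # gesture name -> embedding of its one demo

    def register(self, name, demo_frames):
        # Store a single demonstration as the gesture's prototype.
        self.prototypes[name] = embed(demo_frames)

    def predict(self, frames, threshold=0.5):
        # Classify by cosine similarity to the stored prototypes; since
        # all embeddings are unit-norm, a dot product suffices.
        q = embed(frames)
        best, score = None, -1.0
        for name, proto in self.prototypes.items():
            s = sum(a * b for a, b in zip(q, proto))
            if s > score:
                best, score = name, s
        # Below-threshold queries are treated as irrelevant movement.
        return best if score >= threshold else None

# Usage: register one demo per gesture, then classify new sequences.
clf = OneShotGestureClassifier()
clf.register("swipe", [[1.0, 0.0], [0.9, 0.1]])
clf.register("pinch", [[0.0, 1.0], [0.1, 0.9]])
print(clf.predict([[0.95, 0.05]]))   # matches "swipe"
print(clf.predict([[-1.0, 0.0]]))    # rejected as irrelevant -> None
```

The threshold-based rejection mirrors, in miniature, the paper's requirement to ignore hand movements that belong to no registered gesture; a real system would learn the embedding and calibrate the threshold rather than hard-code them.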