The success of machine learning is closely tied to the availability of high-quality training data, yet collecting and manually labeling new data remains a time-consuming and error-prone process. Traditional annotation tools, such as Label Studio, typically rely on post-processing, in which users label data after it has been recorded. Post-processing is labor-intensive, especially for large datasets, and can produce erroneous annotations because subjects must rely on memory when labeling cognitive activities such as emotions or comprehension levels. In this work, we introduce HandyLabel, a real-time annotation tool that uses hand gesture recognition to map hand signs to labels. Users can customize the gesture mappings through a web-based interface and annotate data in real time. To assess the performance of HandyLabel, we evaluate several hand gesture recognition models on the open-source hand gesture dataset HaGRID, with and without skeleton-based preprocessing. We find that ResNet50 with skeleton-based preprocessed images achieves an F1-score of 0.923. To validate the usability of HandyLabel, we conducted a user study with 46 participants. The results suggest that 88.9% of participants preferred HandyLabel over traditional annotation tools.