LSA64: An Argentinian Sign Language Dataset

Automatic sign language recognition is a research area that spans human-computer interaction, computer vision and machine learning. Robust automatic recognition of sign language could assist translation and the integration of hearing-impaired people, as well as the teaching of sign language to the hearing population. Sign languages differ significantly between countries and even between regions, and their syntax and semantics also differ from those of written languages. While the techniques for automatic sign language recognition are mostly the same across languages, training a recognition system for a new language requires an entire dataset for that language. This paper presents a dataset of 64 signs from Argentinian Sign Language (LSA). The dataset, called LSA64, contains 3200 videos of 64 different LSA signs performed by 10 subjects, and is a first step towards building a comprehensive research-level dataset of Argentinian signs, specifically tailored to sign language recognition and other machine learning tasks. The subjects who performed the signs wore colored gloves to ease hand tracking and segmentation, allowing experiments on the dataset to focus specifically on recognizing the signs. We also present a pre-processed version of the dataset, from which we computed statistics of the movement, position and handshape of the signs.
