Sign languages are the primary languages of hearing-impaired people, who communicate through visual cues such as hand, facial, and body movements. Different signs and gestures represent alphabets, words, and phrases. Approximately 300 sign languages are practiced worldwide today, including American Sign Language (ASL), Chinese Sign Language (CSL), and Indian Sign Language (ISL). Sign languages depend on the vocal language of a region. Unlike vocal or spoken languages, sign languages have no helping verbs such as is, am, are, was, were, will, or be. Because only a limited population is well versed in sign language, this lack of familiarity hinders hearing-impaired people from communicating freely and easily with everyone. This issue can be addressed by a sign language recognition (SLR) system capable of translating sign language into vocal language. In this paper, a continuous SLR system is proposed using a deep learning model employing Long Short-Term Memory (LSTM), trained and tested on a primary ISL dataset. The dataset was created using the MediaPipe Holistic pipeline, which tracks face, hand, and body movements and collects their landmarks. The system recognizes signs and gestures in real time with 88.23% accuracy.
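The landmark-collection step described above can be illustrated with a short sketch. MediaPipe Holistic returns per-frame pose (33 points with visibility), face (468 points), and two hand (21 points each) landmark sets; a common way to build an LSTM input is to flatten these into one fixed-length vector per frame, zero-padding any part the tracker misses. The `extract_keypoints` helper below is a hypothetical illustration of this pattern, not the paper's exact implementation; the 1662-dimension total (33×4 + 468×3 + 2×21×3) is an assumption based on the standard Holistic outputs.

```python
import numpy as np

# Landmark counts from the standard MediaPipe Holistic outputs (assumed here).
POSE_LANDMARKS = 33   # each with x, y, z, visibility
FACE_LANDMARKS = 468  # each with x, y, z
HAND_LANDMARKS = 21   # each with x, y, z

def extract_keypoints(results):
    """Flatten one frame's Holistic results into a fixed-length feature vector.

    Missing parts (e.g. a hand out of frame) are zero-padded so every frame
    yields the same 1662-dimensional vector, suitable for stacking into an
    LSTM input sequence of shape (frames, 1662).
    """
    pose = (np.array([[lm.x, lm.y, lm.z, lm.visibility]
                      for lm in results.pose_landmarks.landmark]).flatten()
            if results.pose_landmarks else np.zeros(POSE_LANDMARKS * 4))
    face = (np.array([[lm.x, lm.y, lm.z]
                      for lm in results.face_landmarks.landmark]).flatten()
            if results.face_landmarks else np.zeros(FACE_LANDMARKS * 3))
    left_hand = (np.array([[lm.x, lm.y, lm.z]
                           for lm in results.left_hand_landmarks.landmark]).flatten()
                 if results.left_hand_landmarks else np.zeros(HAND_LANDMARKS * 3))
    right_hand = (np.array([[lm.x, lm.y, lm.z]
                            for lm in results.right_hand_landmarks.landmark]).flatten()
                  if results.right_hand_landmarks else np.zeros(HAND_LANDMARKS * 3))
    return np.concatenate([pose, face, left_hand, right_hand])  # shape (1662,)
```

In a continuous-SLR setting, such vectors would be collected over a sliding window of frames and fed to the LSTM, which classifies the sequence into a sign or gesture label.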