Recent advances in deep learning and computer vision have been successfully leveraged to serve marginalized communities in various contexts. One such area is sign language, a primary means of communication for the deaf community. So far, however, the bulk of research effort and investment has gone into American Sign Language, and research into low-resource sign languages, especially Bangla Sign Language (BdSL), has lagged significantly. In this paper, we present BdSL40, a new word-level Bangla Sign Language dataset consisting of 611 videos over 40 words, along with two approaches for classifying it: a 3D Convolutional Neural Network model and a novel Graph Neural Network (GNN) approach. This is the first study on word-level BdSL recognition, and the dataset was transcribed from Indian Sign Language (ISL) using the Bangla Sign Language Dictionary (1997). The proposed GNN model achieved an F1 score of 89%. The study highlights the significant lexical and semantic similarity between BdSL, West Bengal Sign Language, and ISL, and the lack of word-level datasets for BdSL in the literature. We release the dataset and source code to stimulate further research.