Indian Sign Language (ISL) has limited resources for developing machine learning and data-driven approaches to automated language processing. Although text- and audio-based language processing techniques have attracted considerable research interest and improved substantially in the last few years, sign languages still lag behind because resources are scarce. To bridge this gap, we propose iSign: a benchmark for Indian Sign Language processing. We make three primary contributions in this work. First, we release one of the largest ISL-English datasets, with more than 118K video-sentence/phrase pairs. To the best of our knowledge, it is the largest sign language dataset available for ISL. Second, we propose multiple NLP-specific tasks (including SignVideo2Text, SignPose2Text, Text2Pose, Word Prediction, and Sign Semantics) and benchmark them with baseline models to make them easier for the research community to access. Third, we provide a detailed analysis of the proposed benchmarks, along with a few linguistic insights into the workings of ISL. We streamline the evaluation of sign language processing, addressing gaps in NLP research on sign languages. We release the dataset, tasks, and models via the following website: https://exploration-lab.github.io/iSign/