Sign language recognition (SLR) has recently achieved a breakthrough in performance thanks to deep neural networks trained on large annotated sign datasets. Such annotated datasets exist for only a few of the many sign languages. Because gloss-level labels for sign language videos are difficult to acquire, transferring knowledge from existing annotated sources is useful for recognition in under-resourced sign languages. This study provides a publicly available cross-dataset transfer learning benchmark built from two existing public Turkish SLR datasets. We use a temporal graph convolution-based sign language recognition approach to evaluate five supervised transfer learning approaches, and we experiment with closed-set and partial-set cross-dataset transfer learning. Experiments demonstrate that specialized supervised transfer learning methods can improve on fine-tuning-based transfer learning.