State-of-the-art sign language translation (SLT) systems facilitate the learning process through gloss annotations, either in an end-to-end manner or through an intermediate step. Unfortunately, gloss-labelled sign language data is usually not available at scale and, when available, gloss annotations differ widely from dataset to dataset. We present a novel approach that uses sentence embeddings of the target sentences at training time to take the role of glosses. This new form of supervision requires no manual annotation but is instead learned from raw textual data. As our approach readily extends to multilingual settings, we evaluate it on datasets covering German (PHOENIX-2014T) and American (How2Sign) sign languages and experiment with both mono- and multilingual sentence embeddings and translation systems. Our approach significantly outperforms other gloss-free approaches, setting a new state of the art for datasets where glosses are not available and where no additional SLT datasets are used for pretraining, and narrowing the gap between gloss-free and gloss-dependent systems.
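The core idea — using frozen sentence embeddings of the target sentence as an auxiliary training signal in place of glosses — can be sketched as follows. This is a minimal illustration, not the paper's implementation: the toy embedding lookup, the mean-pooling step, and the cosine-distance loss are all assumptions standing in for the actual sentence-embedding model and training objective.

```python
import math

def cosine_distance(u, v):
    # 1 - cosine similarity: 0 when vectors align, up to 2 when opposed.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(a * a for a in v))
    return 1.0 - dot / (norm_u * norm_v)

# Stand-in for a frozen sentence-embedding model: maps a target sentence
# to a fixed vector learned from raw text (no manual annotation needed).
sentence_embedding = {
    "morgen regnet es im norden": [0.9, 0.1, 0.2],
}

def auxiliary_loss(video_frame_features, target_sentence):
    # Pool per-frame sign-video features into a single vector (mean pooling
    # here; the real system's pooling/architecture may differ).
    pooled = [sum(col) / len(col) for col in zip(*video_frame_features)]
    # Supervise the pooled representation to match the frozen sentence
    # embedding of the target sentence, playing the role a gloss would.
    return cosine_distance(pooled, sentence_embedding[target_sentence])

frames = [[0.8, 0.2, 0.1], [1.0, 0.0, 0.3]]  # two toy frame feature vectors
loss = auxiliary_loss(frames, "morgen regnet es im norden")
```

In a full system this loss would be added to the usual translation loss, so the video encoder learns sentence-level semantics without any gloss annotation.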