Spoken language glossification (SLG) aims to translate spoken language text into sign language gloss, i.e., a written record of sign language. In this work, we present a framework named $S$emi-$S$upervised $S$poken $L$anguage $G$lossification ($S^3$LG) for SLG. To tackle the bottleneck of limited parallel data in SLG, our $S^3$LG incorporates large-scale monolingual spoken language text into SLG training. The proposed framework follows the self-training structure that iteratively annotates and learns from pseudo labels. Considering the lexical similarity and syntactic difference between sign language and spoken language, our $S^3$LG adopts both a rule-based heuristic and a model-based approach for auto-annotation. During training, we randomly mix these complementary synthetic datasets and mark their differences with a special token. As the synthetic data may be of lower quality, $S^3$LG further leverages consistency regularization to reduce the negative impact of noise in the synthetic data. Extensive experiments on public benchmarks demonstrate the effectiveness of $S^3$LG. Our code is available at \url{https://github.com/yaohj11/S3LG}.
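The data-mixing step described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the tag strings, function names, and example sentences are assumptions, and the actual annotators in $S^3$LG are a rule-based heuristic and a trained gloss model rather than the toy lists used here.

```python
import random

# Assumed special tokens marking which annotator produced each pseudo gloss;
# the paper's actual token names may differ.
RULE_TAG = "<rule>"
MODEL_TAG = "<model>"

def mix_synthetic(rule_pairs, model_pairs, seed=0):
    """Randomly interleave two synthetic (text, gloss) corpora,
    prefixing each gloss with a token that records its provenance."""
    tagged = [(text, f"{RULE_TAG} {gloss}") for text, gloss in rule_pairs]
    tagged += [(text, f"{MODEL_TAG} {gloss}") for text, gloss in model_pairs]
    random.Random(seed).shuffle(tagged)  # random mixing of the two sources
    return tagged

# Toy pseudo-labeled pairs for illustration only.
mixed = mix_synthetic(
    [("tomorrow it will rain", "TOMORROW RAIN")],
    [("tomorrow it will rain", "RAIN TOMORROW")],
)
```

Tagging provenance lets the model condition on the annotation source, so it can learn to discount systematic noise specific to either annotator.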