Knowledge-Aware Audio-Grounded Generative Slot Filling for Limited Annotated Data

Manually annotating fine-grained slot-value labels for task-oriented dialogue (ToD) systems is an expensive and time-consuming endeavour. This motivates research into slot-filling methods that operate with limited amounts of labelled data. Moreover, the majority of current work on ToD is based solely on text as the input modality, neglecting the additional challenges of imperfect automatic speech recognition (ASR) when working with spoken language. In this work, we propose a Knowledge-Aware Audio-Grounded generative slot-filling framework, termed KA2G, that focuses on few-shot and zero-shot slot filling for ToD with speech input. KA2G achieves robust and data-efficient slot filling for speech-based ToD by 1) framing it as a text generation task, 2) grounding text generation additionally in the audio modality, and 3) conditioning on available external knowledge (e.g. a predefined list of possible slot values). We show that combining both modalities within the KA2G framework improves the robustness against ASR errors. Further, the knowledge-aware slot-value generator in KA2G, implemented via a pointer generator mechanism, particularly benefits few-shot and zero-shot learning. Experiments, conducted on the standard speech-based single-turn SLURP dataset and a multi-turn dataset extracted from a commercial ToD system, display strong and consistent gains over prior work, especially in few-shot and zero-shot setups.
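The abstract names a pointer generator mechanism as the basis of the knowledge-aware slot-value generator but gives no details. As a rough illustration only, the sketch below shows the generic pointer-generator idea: a generation distribution over the vocabulary is mixed with a copy distribution over tokens from an external knowledge list. The function name, the toy vocabulary, the attention weights, and the mixing gate value `p_gen` are all assumptions made for this example; this is not the KA2G implementation.

```python
from typing import Dict, List


def pointer_generator_mix(
    vocab_probs: Dict[str, float],
    knowledge_tokens: List[str],
    attention: List[float],
    p_gen: float,
) -> Dict[str, float]:
    """Mix a generation distribution with a copy distribution.

    vocab_probs: hypothetical decoder distribution over a fixed vocabulary.
    knowledge_tokens: tokens from an external list of candidate slot values.
    attention: hypothetical attention weights over knowledge_tokens (sums to 1).
    p_gen: probability of generating from the vocabulary rather than copying.
    """
    if len(knowledge_tokens) != len(attention):
        raise ValueError("one attention weight is needed per knowledge token")
    if not 0.0 <= p_gen <= 1.0:
        raise ValueError("p_gen must lie in [0, 1]")

    # Scale the generation distribution by the gate.
    mixed = {tok: p_gen * p for tok, p in vocab_probs.items()}

    # Add copy mass; knowledge tokens outside the vocabulary become reachable.
    for tok, weight in zip(knowledge_tokens, attention):
        mixed[tok] = mixed.get(tok, 0.0) + (1.0 - p_gen) * weight
    return mixed


# Toy values, chosen only for illustration.
vocab_probs = {"london": 0.1, "paris": 0.2, "<unk>": 0.7}
knowledge_tokens = ["edinburgh", "london"]
attention = [0.9, 0.1]

mixed = pointer_generator_mix(vocab_probs, knowledge_tokens, attention, p_gen=0.4)
best = max(mixed, key=mixed.get)
print(best, round(mixed[best], 2))
```

In this toy setup the slot value "edinburgh" is absent from the vocabulary, yet it still receives probability mass through the copy path. This is one plausible reason such a mechanism could help in few-shot and zero-shot settings, where many slot values are never seen during training.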

