Prompt-based methods with large pre-trained language models (PLMs) have shown impressive unaided performance across many NLP tasks. These models improve even further when a few labeled in-context exemplars are added to guide output generation. However, for more complex tasks such as dialogue state tracking (DST), designing prompts that reliably convey the desired intent is nontrivial, which leads to unstable results. Furthermore, building in-context exemplars for dialogue tasks is difficult because conversational contexts are long while model input lengths are relatively short. To overcome these issues, we first adapt a meta-learning scheme to the dialogue domain, which stabilizes the model's performance across a variety of prompts. We additionally design a novel training method that improves upon vanilla retrieval mechanisms for finding ideal in-context examples. Finally, we introduce a saliency model to limit dialogue text length, allowing us to include more exemplars per query. As a result, we achieve highly competitive results for few-shot DST on MultiWOZ.