Pre-trained language models (PLMs) have made remarkable progress in table-to-text generation. However, the topological gap between tabular data and text and the lack of domain-specific knowledge make it difficult for PLMs to produce faithful text, especially in real-world applications with limited resources. In this paper, we mitigate these challenges by introducing a novel augmentation method, the Prompt-based Adapter (PA), which targets table-to-text generation under few-shot conditions. The core design of the PA is to inject prompt templates that augment domain-specific knowledge and table-related representations into the model through adapters, bridging the structural gap between tabular data and descriptions. This prompt-based knowledge augmentation method brings at least two benefits: (1) it enables us to make full use of large amounts of unlabelled domain-specific knowledge, which alleviates the PLMs' inherent lack of domain knowledge; (2) it allows us to design different types of tasks that support generation. Extensive experiments and analyses are conducted on three open-domain few-shot NLG datasets: Humans, Books, and Songs. Compared to previous state-of-the-art approaches, our model achieves superior performance in both fluency and accuracy, as judged by human and automatic evaluations.