Adversarial Transformer Language Models for Contextual Commonsense Inference

Source: arXiv. Submitted to a Semantic Web Journal special edition. https://semantic-web-journal.org/content/adversarial-transformer-language-models-contextual-commonsense-inference-1

Contextualized, or discourse-aware, commonsense inference is the task of generating coherent commonsense assertions (i.e., facts) from a given story and a particular sentence in that story. The task has several problems: a lack of control over the topics of the inferred facts; a lack of commonsense knowledge during training; and, possibly, hallucinated or false facts. In this work, we use a transformer model for this task and develop techniques to address these problems. We control the inference by introducing a new technique we call "hinting". Hinting is a kind of language model prompting that uses both hard prompts (specific words) and soft prompts (virtual learnable templates). It serves as a control signal that advises the language model "what to talk about". Next, we establish a methodology for performing joint inference with multiple commonsense knowledge bases. Joint inference over commonsense knowledge requires care, because such knowledge is imprecise and its level of generality is more flexible; the results must "still make sense" for the context. To this end, we align the textual version of assertions from three knowledge graphs (ConceptNet, ATOMIC2020, and GLUCOSE) with a story and a target sentence. This combination allows us to train a single model to perform joint inference with multiple knowledge graphs. We show experimental results on joint inference for the three knowledge graphs. Our final contribution is an exploration of a GAN architecture that generates the contextualized commonsense assertions and scores their plausibility with a discriminator. The result is an integrated system for contextual commonsense inference in stories that can controllably generate plausible commonsense assertions and takes advantage of joint inference between multiple commonsense knowledge bases.
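
To make the hard-prompt half of hinting more concrete, here is a minimal sketch of how a hinted model input could be assembled. The abstract does not specify the input format, so everything here is an assumption for illustration: the `Assertion` type, the `make_hint` and `build_input` helpers, the `* ... *` markers around the target sentence, and the `(subject: ...; relation: ...)` hint syntax are all hypothetical. Soft prompts (learnable virtual tokens) are not shown, and this is not the authors' implementation.

```python
import random
from typing import List, NamedTuple


class Assertion(NamedTuple):
    """A commonsense assertion as a (subject, relation, object) triple."""
    subject: str
    relation: str
    obj: str


def make_hint(assertion: Assertion, rng: random.Random, keep_prob: float = 0.5) -> str:
    """Build a hard-prompt hint from a random subset of the assertion's parts.

    Each part is kept independently with probability keep_prob, so the model
    could see anything from no hint at all to the full assertion.
    (Hypothetical format; the paper's actual hint syntax may differ.)
    """
    named_parts = (
        ("subject", assertion.subject),
        ("relation", assertion.relation),
        ("object", assertion.obj),
    )
    kept = [f"{name}: {value}" for name, value in named_parts if rng.random() < keep_prob]
    return "(" + "; ".join(kept) + ")" if kept else ""


def build_input(story: List[str], target_index: int, hint: str = "") -> str:
    """Join the story into one string, mark the target sentence, append the hint."""
    marked = [
        f"* {sentence} *" if i == target_index else sentence
        for i, sentence in enumerate(story)
    ]
    return f"{' '.join(marked)} {hint}".strip()


if __name__ == "__main__":
    story = ["Ann was hungry.", "She made a sandwich."]
    assertion = Assertion("PersonX makes a sandwich", "xWant", "to eat")
    hint = make_hint(assertion, random.Random(0))
    print(build_input(story, target_index=1, hint=hint))
```

At inference time, under the same assumptions, a user could write the hint by hand (for example, only a relation) to steer which kind of assertion the model generates for the marked sentence.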
