Grounded language models use external sources of information, such as knowledge graphs, to address some of the general challenges associated with pre-training. Building on previous work on compositional generalization in semantic parsing, we enable a controlled evaluation of the degree to which these models learn and generalize from patterns in knowledge graphs. We develop a procedure for generating natural language questions paired with knowledge graphs that targets different aspects of compositionality and, further, avoids grounding the language models in information already encoded implicitly in their weights. We evaluate existing methods for combining language models with knowledge graphs and find that they struggle to generalize to sequences of unseen lengths and to novel combinations of seen base components. While our experimental results provide some insight into the expressive power of these models, we hope our work and released datasets motivate future research on how to better combine language models with structured knowledge representations.