Language models trained on large text corpora encode rich distributional information about real-world environments and action sequences. This information plays a crucial role in current approaches to language processing tasks like question answering and instruction generation. We describe how to leverage language models for *non-linguistic* perception and control tasks. Our approach casts labeling and decision-making as inference in probabilistic graphical models in which language models parameterize prior distributions over labels, decisions, and parameters, making it possible to integrate uncertain observations and incomplete background knowledge in a principled way. Applied to semantic segmentation, household navigation, and activity recognition tasks, this approach improves predictions on rare, out-of-distribution, and structurally novel inputs.
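To make the idea of a language-model prior concrete, the following is a minimal sketch rather than the authors' implementation. It assumes a single discrete label, a prior over that label that would in practice be read off a language model (here replaced by a hard-coded, hypothetical dictionary), and an equally hypothetical likelihood from a perception model. The function name `posterior` and all numbers are illustrative assumptions.

```python
import math


def posterior(prior, likelihood):
    """Bayes' rule over a shared label set: p(y | x) is proportional to p(y) * p(x | y).

    Both arguments map label -> probability. The computation is done in log
    space and normalized with log-sum-exp for numerical stability.
    """
    labels = list(prior)
    scores = [math.log(prior[y]) + math.log(likelihood[y]) for y in labels]
    top = max(scores)
    log_z = top + math.log(sum(math.exp(s - top) for s in scores))
    return {y: math.exp(s - log_z) for y, s in zip(labels, scores)}


# Hypothetical stand-in for a language-model prior p(object | room = "kitchen").
# In a real system these values would come from scoring text with a language model.
lm_prior = {"stove": 0.70, "sofa": 0.25, "bathtub": 0.05}

# Hypothetical perception likelihood p(image region | object): the observation
# alone is nearly uninformative and slightly prefers the wrong labels.
obs_likelihood = {"stove": 0.30, "sofa": 0.35, "bathtub": 0.35}

post = posterior(lm_prior, obs_likelihood)
best_label = max(post, key=post.get)
```

With these made-up numbers the observation alone cannot separate the labels, while the prior shifts the posterior toward `stove`. This is the sense in which background knowledge and uncertain observations are combined in one inference step.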