A key missing capacity of current language models (LMs) is grounding to real-world environments. Most existing work on grounded language understanding uses LMs to directly generate plans that can be executed in the environment to achieve the desired effects. This places the entire burden of ensuring grammaticality, faithfulness, and controllability on the LMs. We propose Pangu, a generic framework for grounded language understanding that capitalizes on the discriminative ability of LMs instead of their generative ability. Pangu consists of a symbolic agent and a neural LM working in concert: the agent explores the environment to incrementally construct valid plans, and the LM evaluates the plausibility of the candidate plans to guide the search process. A case study on the challenging problem of knowledge base question answering (KBQA), which features a massive environment, demonstrates the remarkable effectiveness and flexibility of Pangu: a BERT-base LM is sufficient for setting a new record on standard KBQA datasets, and larger LMs further bring substantial gains. Pangu also enables, for the first time, effective few-shot in-context learning for KBQA with large LMs such as Codex.
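The division of labor described in the abstract, where the agent proposes only valid plans and the LM ranks them, can be illustrated with a minimal beam-search sketch. Everything here is an assumption made for illustration: the toy `KB`, the helper names (`execute`, `expand`, `overlap_score`, `guided_search`), and the stopping rule are hypothetical, and the word-overlap scorer merely stands in for an LM's plausibility score. It is not the paper's implementation.

```python
from typing import Callable, Dict, List, Tuple

# Toy knowledge base: (subject, relation) -> objects. Made up for illustration.
KB: Dict[Tuple[str, str], List[str]] = {
    ("Obama", "spouse"): ["Michelle"],
    ("Obama", "born_in"): ["Honolulu"],
    ("Michelle", "born_in"): ["Chicago"],
    ("Honolulu", "located_in"): ["Hawaii"],
}

Plan = Tuple[str, ...]  # a sequence of relations applied to the start entity


def execute(start: str, plan: Plan) -> List[str]:
    """Follow the relations in `plan` from `start`; return the entities reached."""
    frontier = [start]
    for rel in plan:
        frontier = [o for s in frontier for o in KB.get((s, rel), [])]
    return frontier


def expand(start: str, plan: Plan) -> List[Plan]:
    """Agent step: extend `plan` by one relation that exists in the environment,
    so every candidate is executable by construction."""
    reachable = set(execute(start, plan))
    relations = sorted({r for (s, r) in KB if s in reachable})
    return [plan + (r,) for r in relations]


def overlap_score(question: str, plan: Plan) -> float:
    """Stand-in for the LM (assumption): reward plan tokens found in the
    question and penalize those that are not."""
    q = set(question.lower().replace("?", "").split())
    tokens = [t for rel in plan for t in rel.split("_")]
    hits = sum(t in q for t in tokens)
    return hits - 0.5 * (len(tokens) - hits)


def guided_search(
    question: str,
    start: str,
    score: Callable[[str, Plan], float],
    beam_size: int = 2,
    max_steps: int = 3,
) -> Plan:
    """Beam search in which the scorer only ranks candidates; it never generates them."""
    beam: List[Plan] = [()]
    best: Plan = ()
    best_score = float("-inf")
    for _ in range(max_steps):
        candidates = [p for plan in beam for p in expand(start, plan)]
        if not candidates:
            break  # nothing left to explore in the environment
        ranked = sorted(candidates, key=lambda p: score(question, p), reverse=True)
        beam = ranked[:beam_size]
        top = score(question, beam[0])
        if top > best_score:
            best, best_score = beam[0], top
    return best


if __name__ == "__main__":
    plan = guided_search("Where was Obama's spouse born?", "Obama", overlap_score)
    print(plan, execute("Obama", plan))
```

On this toy input the search returns the plan `("spouse", "born_in")`, which executes to `["Chicago"]`. Because candidates come only from `expand`, the scorer can never select a plan that fails to execute in the environment, which is the property the abstract attributes to using the LM discriminatively rather than generatively. Replacing `overlap_score` with a real LM-based scorer would not require changes to the search loop.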