Procedural planning, which entails decomposing a high-level goal into a sequence of temporally ordered steps, is an important yet intricate task for machines. It involves integrating common-sense knowledge to reason about complex contextualized situations that are often counterfactual, e.g., "scheduling a doctor's appointment without a phone". While current approaches show encouraging results using large language models (LLMs), they are hindered by drawbacks such as costly API calls and reproducibility issues. In this paper, we advocate planning using smaller language models. We present PlaSma, a novel two-pronged approach to endow small language models with procedural knowledge and (counterfactual) planning capabilities. More concretely, we develop symbolic procedural knowledge distillation to enhance the implicit knowledge in small language models and an inference-time algorithm to facilitate more structured and accurate reasoning. In addition, we introduce a novel task, Counterfactual Planning, that requires revising a plan to cope with a counterfactual situation. In both the original and counterfactual settings, we show that orders-of-magnitude smaller models (770M-11B parameters) can compete with, and often surpass, the capabilities of their larger teacher models.