Humans understand language by extracting information (meaning) from sentences, combining it with existing commonsense knowledge, and then reasoning to draw conclusions. While large language models (LLMs) such as GPT-3 and ChatGPT can leverage patterns in text to solve a variety of NLP tasks, they fall short on problems that require reasoning. They also cannot reliably explain the answers they generate for a given question. To emulate humans better, we propose STAR, a framework that combines LLMs with Answer Set Programming (ASP). We show how LLMs can be used to effectively extract knowledge, represented as predicates, from language. Goal-directed ASP is then employed to reason reliably over this knowledge. We apply the STAR framework to three natural language understanding (NLU) tasks that require reasoning: qualitative reasoning, mathematical reasoning, and goal-directed conversation. Our experiments show that STAR bridges the reasoning gap in NLU tasks, leading to significant performance improvements, especially for smaller LLMs, i.e., LLMs with fewer parameters. NLU applications developed using the STAR framework are also explainable: along with the generated predicates, a justification in the form of a proof tree can be produced for a given output.
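To make the two-stage idea concrete, the following is a minimal sketch and not the STAR implementation. It assumes the LLM extraction step has already produced a predicate (hard-coded here as `FACTS`), and it substitutes a plain Horn-clause backward chainer for goal-directed ASP, so it has no negation or answer-set semantics. All names (`FACTS`, `RULES`, `solve`, `answer`) and the friction example are hypothetical and chosen only for illustration.

```python
import itertools
from typing import Dict, Iterator, List, Optional, Tuple

Atom = Tuple[str, ...]   # e.g. ("higher", "friction", "carpet", "wood")
Env = Dict[str, str]     # variable bindings
Proof = Tuple[Atom, list]  # (goal, proofs of its subgoals)

# Assumption: predicates an LLM might extract from
# "The carpet has more friction than the wood floor."
FACTS: List[Atom] = [("higher", "friction", "carpet", "wood")]

# Assumption: one hand-written commonsense rule (head, body).
# More friction on A than on B implies more distance travelled on B than on A.
RULES: List[Tuple[Atom, List[Atom]]] = [
    (("higher", "distance", "B", "A"), [("higher", "friction", "A", "B")]),
]

MAX_DEPTH = 8                 # guard against runaway recursion
_fresh = itertools.count()    # used to rename rule variables apart


def is_var(term: str) -> bool:
    """Terms starting with an uppercase letter are variables."""
    return term[:1].isupper()


def walk(term: str, env: Env) -> str:
    """Follow bindings until reaching a constant or an unbound variable."""
    while is_var(term) and term in env:
        term = env[term]
    return term


def unify(a: Atom, b: Atom, env: Env) -> Optional[Env]:
    """Return extended bindings that make a and b equal, or None."""
    if len(a) != len(b):
        return None
    env = dict(env)
    for x, y in zip(a, b):
        x, y = walk(x, env), walk(y, env)
        if x == y:
            continue
        if is_var(x):
            env[x] = y
        elif is_var(y):
            env[y] = x
        else:
            return None
    return env


def rename(atom: Atom, n: int) -> Atom:
    """Give rule variables a fresh suffix so separate uses do not clash."""
    return tuple(f"{t}_{n}" if is_var(t) else t for t in atom)


def solve(goal: Atom, env: Env, depth: int = 0) -> Iterator[Tuple[Env, Proof]]:
    """Goal-directed search: start from the query and work back to facts."""
    if depth > MAX_DEPTH:
        return
    for fact in FACTS:
        env1 = unify(goal, fact, env)
        if env1 is not None:
            yield env1, (goal, [])
    for head, body in RULES:
        n = next(_fresh)
        env1 = unify(goal, rename(head, n), env)
        if env1 is None:
            continue
        subgoals = [rename(b, n) for b in body]
        for env2, proofs in solve_all(subgoals, env1, depth + 1):
            yield env2, (goal, proofs)


def solve_all(goals: List[Atom], env: Env, depth: int) -> Iterator[Tuple[Env, List[Proof]]]:
    """Prove every goal in order, threading bindings through."""
    if not goals:
        yield env, []
        return
    for env1, proof in solve(goals[0], env, depth):
        for env2, rest in solve_all(goals[1:], env1, depth):
            yield env2, [proof] + rest


def resolve(atom: Atom, env: Env) -> Atom:
    """Replace bound variables in an atom with their values."""
    return tuple(walk(t, env) for t in atom)


def render(proof: Proof, env: Env, indent: int = 0) -> List[str]:
    """Flatten a proof tree into indented lines, one per proved goal."""
    goal, children = proof
    lines = ["  " * indent + " ".join(resolve(goal, env))]
    for child in children:
        lines.extend(render(child, env, indent + 1))
    return lines


def answer(query: Atom) -> Optional[Tuple[Atom, List[str]]]:
    """Return the first conclusion for the query with its proof, if any."""
    for env, proof in solve(query, {}):
        return resolve(query, env), render(proof, env)
    return None


if __name__ == "__main__":
    result = answer(("higher", "distance", "X", "Y"))
    if result is not None:
        print("\n".join(result[1]))
```

In this toy setting, the query `("higher", "distance", "X", "Y")` binds `X` to `wood` and `Y` to `carpet`, and the returned lines form a two-level justification: the conclusion at the top and the extracted fact that supports it underneath. The point is only the division of labour: the language model's job ends at producing predicates, and a symbolic engine derives both the answer and the explanation from them.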