Humans understand language by extracting information (meaning) from sentences, combining it with existing commonsense knowledge, and then reasoning to draw conclusions. While large language models (LLMs) such as GPT-3 and ChatGPT can leverage patterns in text to solve a variety of NLP tasks, they fall short on problems that require reasoning. They also cannot reliably explain the answers they generate for a given question. To emulate humans better, we propose STAR, a framework that combines LLMs with Answer Set Programming (ASP). We show how LLMs can be used to effectively extract knowledge -- represented as predicates -- from language. Goal-directed ASP is then employed to reliably reason over this knowledge. We apply the STAR framework to three different natural language understanding (NLU) tasks requiring reasoning: qualitative reasoning, mathematical reasoning, and goal-directed conversation. Our experiments reveal that STAR is able to bridge the reasoning gap in NLU tasks, leading to significant performance improvements, especially for smaller LLMs, i.e., LLMs with fewer parameters. NLU applications developed using the STAR framework are also explainable: along with the predicates generated, a justification in the form of a proof tree can be produced for a given output.
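The extract-then-reason pipeline described in the abstract can be illustrated with a minimal sketch. This is not the STAR implementation and does not use an ASP system: the LLM step is replaced by a hard-coded, hypothetical `extract_predicates` function, and goal-directed ASP is replaced by a plain backward-chaining prover over definite rules (no negation, no answer sets), which is far weaker but shows the same query-driven style and how a proof tree can accompany an answer. The predicate names `more_friction` and `slides_farther`, the example sentence, and the commonsense rule are all assumptions made for illustration.

```python
from itertools import count

def extract_predicates(sentence):
    """Hypothetical stand-in for the LLM step: returns hard-coded predicates."""
    canned = {
        "Carpet has more friction than ice.": [("more_friction", "carpet", "ice")],
    }
    return canned[sentence]

# Assumed commonsense rule: slides_farther(X, Y) :- more_friction(Y, X).
RULES = [(("slides_farther", "X", "Y"), [("more_friction", "Y", "X")])]

def is_var(t):
    # Convention in this sketch: variables are strings starting with uppercase.
    return isinstance(t, str) and t[:1].isupper()

def walk(t, s):
    # Follow variable bindings in substitution s until a value or free variable.
    while is_var(t) and t in s:
        t = s[t]
    return t

def unify(a, b, s):
    """Unify two flat atoms under substitution s; return a new dict or None."""
    if len(a) != len(b):
        return None
    s = dict(s)
    for x, y in zip(a, b):
        x, y = walk(x, s), walk(y, s)
        if x == y:
            continue
        if is_var(x):
            s[x] = y
        elif is_var(y):
            s[y] = x
        else:
            return None
    return s

_fresh = count()

def rename(rule):
    # Give the rule's variables fresh names so separate uses do not clash.
    head, body = rule
    n = next(_fresh)
    r = lambda atom: tuple(f"{t}_{n}" if is_var(t) else t for t in atom)
    return r(head), [r(b) for b in body]

def solve(goal, facts, rules, s, depth=0):
    """Yield (substitution, proof_tree) for each way to prove the goal."""
    if depth > 20:  # crude guard against runaway recursion
        return
    for fact in facts:
        s2 = unify(goal, fact, s)
        if s2 is not None:
            yield s2, (fact, "fact", [])
    for rule in rules:
        head, body = rename(rule)
        s2 = unify(goal, head, s)
        if s2 is not None:
            for s3, subtrees in solve_body(body, facts, rules, s2, depth + 1):
                yield s3, (head, "rule", subtrees)

def solve_body(body, facts, rules, s, depth):
    # Prove each body atom left to right, threading the substitution through.
    if not body:
        yield s, []
        return
    for s1, tree in solve(body[0], facts, rules, s, depth):
        for s2, rest in solve_body(body[1:], facts, rules, s1, depth):
            yield s2, [tree] + rest

def render(tree, s, indent=0):
    """Return the proof tree as indented lines, with variables resolved."""
    atom, kind, children = tree
    atom = tuple(walk(t, s) for t in atom)
    lines = ["  " * indent + f"{atom[0]}({', '.join(map(str, atom[1:]))})  [{kind}]"]
    for child in children:
        lines += render(child, s, indent + 1)
    return lines

facts = extract_predicates("Carpet has more friction than ice.")
query = ("slides_farther", "Where", "carpet")
for subst, proof in solve(query, facts, RULES, {}):
    print("Where =", walk("Where", subst))
    print("\n".join(render(proof, subst)))
# Prints:
# Where = ice
# slides_farther(ice, carpet)  [rule]
#   more_friction(carpet, ice)  [fact]
```

The query is answered by working backward from the goal rather than by enumerating every consequence of the facts, and the returned tree records which rule and which extracted predicate support the answer. A real goal-directed ASP system would additionally handle negation and constraints, which this sketch does not attempt.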