Rapid advancements in Large Language Models (LLMs) have significantly enhanced their reasoning capabilities. Despite improved performance on benchmarks, LLMs exhibit notable gaps in their cognitive processes. Additionally, as reflections of human-generated data, these models have the potential to inherit cognitive biases, raising concerns about their reasoning and decision-making capabilities. In this paper, we present a framework to interpret, understand, and provide insights into a host of cognitive biases in LLMs. Conducting our research on frontier language models, we elucidate reasoning limitations and biases, and explain these biases by constructing influence graphs that identify the phrases and words most responsible for the biases manifested in LLMs. We further investigate biases such as round-number bias and a cognitive bias barrier revealed by the framing effect in language models.