Large language models (LLMs) have shown remarkable performance across a wide range of tasks. However, the mechanisms by which these models encode tasks of varying complexity remain poorly understood. In this paper, we explore the hypothesis that LLMs process concepts of varying complexity in different layers, introducing the notion of "Concept Depth" to capture the idea that more complex concepts are typically acquired in deeper layers. Specifically, we categorize concepts by their level of abstraction, defining them in order of increasing complexity within factual, emotional, and inferential tasks. We conduct extensive probing experiments on layer-wise representations across several LLM families (Gemma, LLaMA, Qwen) and datasets spanning these three task domains. Our findings reveal that simpler tasks can be probed accurately from shallow layers, whereas more complex tasks typically require deeper layers for accurate understanding. We also examine how external factors, such as adding noise to the input and quantizing the model weights, affect layer-wise representations. Our results suggest that these factors can delay the emergence of conceptual understanding in LLMs, pushing it to deeper layers. We hope that the proposed notion of Concept Depth and our experimental insights will advance the understanding of the mechanisms underlying LLMs. Our code is available at \url{https://github.com/Luckfort/CD}.
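To make the probing setup concrete, the sketch below shows one common way to run layer-wise linear probes on a HuggingFace-style model; it is a minimal illustration under assumed conventions (mean-pooled hidden states, logistic-regression probes, a hypothetical helper name `layerwise_probe_accuracies`), not the authors' released implementation.

\begin{verbatim}
# Minimal layer-wise probing sketch (illustrative; see the released code
# at github.com/Luckfort/CD for the actual implementation).
# For each layer, a linear probe is trained on pooled hidden states to
# predict a binary concept label; higher accuracy at layer k suggests
# the concept is linearly decodable by depth k.
import torch
from transformers import AutoTokenizer, AutoModel
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def layerwise_probe_accuracies(model_name, texts, labels):
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModel.from_pretrained(model_name, output_hidden_states=True)
    model.eval()

    # Collect per-layer sentence representations (mean-pooled over tokens).
    per_layer_feats = None
    with torch.no_grad():
        for text in texts:
            inputs = tokenizer(text, return_tensors="pt", truncation=True)
            # hidden_states: (embeddings, layer 1, ..., layer L)
            hidden_states = model(**inputs).hidden_states
            pooled = [h.mean(dim=1).squeeze(0) for h in hidden_states]
            if per_layer_feats is None:
                per_layer_feats = [[] for _ in pooled]
            for layer_idx, vec in enumerate(pooled):
                per_layer_feats[layer_idx].append(vec.float().numpy())

    # Fit an independent logistic-regression probe per layer.
    accuracies = []
    for feats in per_layer_feats:
        probe = LogisticRegression(max_iter=1000)
        scores = cross_val_score(probe, feats, labels, cv=5)
        accuracies.append(scores.mean())
    return accuracies  # accuracies[k]: probe accuracy at layer k
\end{verbatim}

Plotting these per-layer accuracies for tasks of increasing complexity is what reveals the Concept Depth pattern: accuracy for simple factual tasks saturates early, while inferential tasks only become linearly decodable in later layers.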