Large language models (LLMs) have shown remarkable performance across a wide range of tasks. However, the mechanisms by which these models encode tasks of varying complexity remain poorly understood. In this paper, we explore the hypothesis that LLMs process concepts of varying complexity in different layers, and we introduce the notion of "Concept Depth" to capture the idea that more complex concepts are typically acquired in deeper layers. Specifically, we categorize concepts by their level of abstraction and order them by increasing complexity within factual, emotional, and inferential tasks. We conduct extensive probing experiments on layer-wise representations from several LLM families (Gemma, LLaMA, QWen) over multiple datasets spanning these three task domains. Our findings reveal that probes can accurately recover simpler concepts from shallow layers, whereas more complex concepts typically require deeper layers. Additionally, we examine how external factors, such as adding noise to the input and quantizing the model weights, affect layer-wise representations. Our findings suggest that these factors can delay the emergence of conceptual understanding in LLMs until deeper layers. We hope that our proposed concept and experimental insights will enhance the understanding of the mechanisms underlying LLMs. Our code is available at https://github.com/Luckfort/CD.
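The layer-wise probing idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (see the linked repository for that); it assumes synthetic features in place of real LLM hidden states, a plain logistic-regression probe trained by gradient descent, and hypothetical names throughout (`train_logistic_probe`, `probe_accuracy`, `synthetic_layer_features`, `layerwise_probe`). The per-layer `signals` values are an assumption that stands in for how strongly a concept is linearly encoded at each depth.

```python
import numpy as np


def train_logistic_probe(X, y, lr=0.1, steps=300):
    """Fit a binary logistic-regression probe with plain gradient descent."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # predicted probabilities
        grad = p - y                             # gradient of the log loss w.r.t. logits
        w -= lr * (X.T @ grad) / len(y)
        b -= lr * grad.mean()
    return w, b


def probe_accuracy(X_train, y_train, X_test, y_test):
    """Train a probe on one layer's features and return held-out accuracy."""
    w, b = train_logistic_probe(X_train, y_train)
    pred = (X_test @ w + b) > 0
    return float((pred == y_test.astype(bool)).mean())


def synthetic_layer_features(y, signal, dim, rng):
    """Hypothetical stand-in for hidden states: Gaussian noise plus a
    label-dependent shift of size `signal` along one fixed direction."""
    direction = np.ones(dim) / np.sqrt(dim)
    noise = rng.normal(size=(len(y), dim))
    return noise + signal * np.outer(2 * y - 1, direction)


def layerwise_probe(signals, n=400, dim=16, seed=0):
    """Return one probe accuracy per simulated layer (one entry of `signals` each)."""
    rng = np.random.default_rng(seed)
    y = rng.integers(0, 2, size=n).astype(float)
    split = n // 2  # first half trains the probe, second half evaluates it
    accs = []
    for signal in signals:
        X = synthetic_layer_features(y, signal, dim, rng)
        accs.append(probe_accuracy(X[:split], y[:split], X[split:], y[split:]))
    return accs


if __name__ == "__main__":
    # Assumed signal strengths: the concept becomes easier to read out with depth.
    for layer, acc in enumerate(layerwise_probe([0.0, 0.5, 1.0, 2.0, 3.0])):
        print(f"layer {layer}: probe accuracy = {acc:.2f}")
```

In a real experiment, `synthetic_layer_features` would be replaced by hidden states extracted from each layer of the model for a labeled dataset. The first layer at which probe accuracy plateaus then gives a rough, operational reading of where a concept becomes linearly decodable.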