Prior studies of emergence in large models have focused primarily on how the functional capabilities of large language models (LLMs) scale with model size. Our research goes beyond this paradigm: we aim to deepen the understanding of emergence in LLMs by emphasizing not only model size but, more importantly, the complex behavior of neuron interactions during training. By introducing the concepts of "self-organization" and "multifractal analysis," we explore how neuron interactions evolve dynamically during training and lead to "emergence," mirroring the phenomenon in natural systems where simple micro-level interactions give rise to complex macro-level behaviors. To quantitatively analyze the continuously evolving interactions among neurons in large models during training, we propose Neuron-based Multifractal Analysis (NeuroMFA). Using NeuroMFA, we conduct a comprehensive examination of emergent behavior in LLMs through the lens of both model size and training process, opening new avenues for research into emergence in large models.