Large language models (LLMs) have recently achieved major breakthroughs in language processing, yet the mechanisms by which they process multiple languages remain poorly understood. In this work, we therefore study the multilingual activation patterns of LLMs. By converting the original LLMs into a Mixture-of-Experts (MoE) architecture, we analyze the expert activation patterns that arise when processing different languages and show how these patterns are related at the level of language families. We find both non-language-specific neurons and language-specific neurons. Further exploration shows that using only the high-frequency activation neurons can accelerate inference while maintaining comparable performance. These findings shed light on the multilingual processing mechanisms of LLMs and are of significant importance in guiding their multilingual training and model pruning.
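To make the distinction between non-language-specific and language-specific neurons concrete, the following is a minimal sketch, not the procedure used in this work. It assumes a neuron counts as "active" on a token when its activation is positive, and that a neuron is "high-frequency" for a language when it is active on at least a fixed fraction of tokens. The function names, the 0.5 threshold, and the toy data are all hypothetical, and the MoE conversion step is omitted entirely.

```python
from typing import Dict, List, Set, Tuple


def activation_frequency(activations: List[List[float]]) -> List[float]:
    """Fraction of tokens on which each neuron fires (assumed: activation > 0)."""
    n_tokens = len(activations)
    n_neurons = len(activations[0])
    counts = [0] * n_neurons
    for token_acts in activations:
        for j, value in enumerate(token_acts):
            if value > 0:
                counts[j] += 1
    return [c / n_tokens for c in counts]


def split_neurons(
    freq_by_lang: Dict[str, List[float]],
    threshold: float = 0.5,  # hypothetical cutoff for "high-frequency"
) -> Tuple[Set[int], Dict[str, Set[int]]]:
    """Return (shared, specific).

    shared:   neurons that are high-frequency in every language.
    specific: per language, neurons that are high-frequency in that language only.
    """
    langs = list(freq_by_lang)
    n_neurons = len(freq_by_lang[langs[0]])
    shared: Set[int] = set()
    specific: Dict[str, Set[int]] = {lang: set() for lang in langs}
    for j in range(n_neurons):
        active_in = [lang for lang in langs if freq_by_lang[lang][j] >= threshold]
        if len(active_in) == len(langs):
            shared.add(j)
        elif len(active_in) == 1:
            specific[active_in[0]].add(j)
    return shared, specific


# Toy activations (tokens x neurons) for two languages; values are made up.
toy = {
    "en": [[1.0, 0.0, 0.3], [0.5, 0.0, 0.2]],
    "zh": [[0.7, 0.9, 0.0], [0.2, 0.4, 0.0]],
}
freqs = {lang: activation_frequency(acts) for lang, acts in toy.items()}
shared, specific = split_neurons(freqs)
```

In this toy setup, a pruning experiment in the spirit of the abstract would keep only the neurons in `shared` plus those in `specific[lang]` for the target language. Whether that preserves performance on a real model is an empirical question this sketch does not address.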