Large Language Models (LLMs) have demonstrated notable proficiency in both code generation and comprehension across multiple programming languages. However, the mechanisms underlying this proficiency remain underexplored, particularly with respect to whether distinct programming languages are processed independently or within a shared parametric region. Drawing an analogy to the specialized regions of the brain responsible for distinct cognitive functions, we introduce the concept of Coding Spot, a specialized parametric region within LLMs that facilitates coding capabilities. Our findings identify this Coding Spot and show that targeted modifications to this subset significantly affect performance on coding tasks, while largely preserving non-coding functionalities. This compartmentalization mirrors the functional specialization observed in cognitive neuroscience, where specific brain regions are dedicated to distinct tasks, suggesting that LLMs may similarly employ specialized parameter regions for different knowledge domains.
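As a loose illustration only (the paper's actual identification procedure is not described in this abstract), the idea of a "targeted modification" to a Coding Spot can be sketched as scoring parameters by an assumed coding-importance measure and zeroing the top-scoring subset. All names and the scoring scheme below are hypothetical stand-ins.

```python
import numpy as np

# Hypothetical sketch: 'weights' stands in for one flattened LLM weight
# matrix, and 'importance' is an assumed per-parameter coding-importance
# score (the abstract does not specify how such scores are computed).
rng = np.random.default_rng(0)
weights = rng.normal(size=1000)
importance = np.abs(rng.normal(size=1000))

# Take the top 1% of parameters by importance as the "Coding Spot" subset.
k = int(0.01 * weights.size)
coding_spot = np.argsort(importance)[-k:]

# Targeted modification: ablate (zero out) only that subset, leaving the
# remaining ~99% of parameters untouched.
ablated = weights.copy()
ablated[coding_spot] = 0.0

# Exactly k parameters differ between the original and ablated weights.
print(np.count_nonzero(weights != ablated))
```

In the paper's framing, such an ablation would be expected to degrade coding-task performance sharply while leaving non-coding behavior largely intact; the sketch only shows the mechanics of modifying a selected parameter subset, not any claim about which parameters matter.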