Large Language Models (LLMs) have significantly advanced artificial intelligence, excelling in numerous tasks. Although a model's functionality is inherently tied to its parameters, a systematic method for exploring the connections between the parameters and the functionality is lacking. Models sharing a similar structure and parameter count exhibit significant performance disparities across various tasks, prompting investigations into the patterns that govern their performance. We adopted a mutagenesis screen approach, inspired by methods used in biological studies, to investigate Llama2-7b and Zephyr. This technique involved mutating elements within the models' matrices to their maximum or minimum values to examine the relationship between model parameters and their functionalities. Our research uncovered multiple levels of fine structure within both models. Many matrices showed a mixture of maximum and minimum mutations following mutagenesis, while others were predominantly sensitive to one type. Notably, mutations that produced phenotypes, especially those with severe outcomes, tended to cluster along axes. Additionally, the locations of maximum and minimum mutations often displayed a complementary pattern on the matrix in both models, with the Gate matrix showing a unique two-dimensional asymmetry after rearrangement. In Zephyr, certain mutations consistently produced poetic or conversational rather than descriptive outputs. These "writer" mutations grouped according to the high-frequency initial word of the output, with a marked tendency to share a row coordinate even when located in different matrices. Our findings affirm that the mutagenesis screen is an effective tool for deciphering the complexities of large language models and identifying unexpected ways to expand their potential, offering deeper insights into the foundational aspects of AI systems.
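The core mutation operation described above can be sketched as follows. This is a minimal NumPy illustration of the idea (setting a single element of a weight matrix to the matrix-wide maximum or minimum), not the authors' implementation; the function name and the toy matrix are hypothetical.

```python
import numpy as np

def mutagenize(weights, row, col, mode="max"):
    """Return a copy of a weight matrix with one element mutated
    to the matrix-wide maximum or minimum value."""
    mutated = weights.copy()
    extreme = weights.max() if mode == "max" else weights.min()
    mutated[row, col] = extreme
    return mutated

# Toy example on a small random "weight matrix" standing in for a model matrix.
rng = np.random.default_rng(0)
W = rng.normal(size=(4, 4))
W_max = mutagenize(W, row=1, col=2, mode="max")
W_min = mutagenize(W, row=1, col=2, mode="min")
```

In a screen, this operation would be applied element by element (or over a sampled grid of coordinates) and each mutated model evaluated for a phenotype, such as a change in generated text.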