Evolutionary methods have long been useful for analysis and explanation in genetics, biology, ecology, and related fields. In this work, we extend these methods to neural networks, specifically large language models (LLMs), to better analyze and explain relationships among models. We show how relating weights to genotypes and output text to phenotypes can improve our understanding of model lineage, important datasets, the roles of different model layers, and visualization of model relationships. We demonstrate this in a controlled experiment, where our estimated evolutionary trees reliably recover the topology of the ground-truth training tree. We further identify the most important weight layers according to weight differences and show through phenotypic experiments that one training dataset appears to contribute more useful information than the others. Finally, we generate an unsupervised evolutionary tree of black-box foundation models. Throughout, we provide visualizations that support a clearer understanding of evolutionary relationships among LLMs.
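As a rough illustration of the weights-as-genotypes idea, the sketch below builds a tree from pairwise distances between flattened weight vectors. It rests on assumptions that do not come from this work: Euclidean distance as the genotype distance, average-linkage clustering (UPGMA) as the tree-building step, and hypothetical toy models (`base`, `ft_a`, `ft_a2`, `ft_b`, `ft_b2`) with three weights each. The tree-estimation procedure actually used here may differ.

```python
import math
from itertools import combinations


def weight_distance(a, b):
    """Euclidean distance between two flattened weight vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def upgma(weights):
    """Average-linkage (UPGMA) clustering over a dict of name -> weight vector.

    Returns a nested tuple such as (("m1", "m2"), "m3") describing the tree.
    """
    # Each cluster is keyed by its (sub)tree and stores its member model names.
    clusters = {name: [name] for name in weights}
    # Pairwise distances between individual models, computed once.
    dist = {
        frozenset(pair): weight_distance(weights[pair[0]], weights[pair[1]])
        for pair in combinations(weights, 2)
    }

    def cluster_distance(c1, c2):
        # Mean distance over all cross-cluster model pairs.
        pairs = [(m, n) for m in clusters[c1] for n in clusters[c2]]
        return sum(dist[frozenset(p)] for p in pairs) / len(pairs)

    while len(clusters) > 1:
        # Merge the two closest clusters into a new internal node.
        a, b = min(combinations(clusters, 2), key=lambda p: cluster_distance(*p))
        clusters[(a, b)] = clusters.pop(a) + clusters.pop(b)
    return next(iter(clusters))


def leaves(tree):
    """Set of model names under a (sub)tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return leaves(tree[0]) | leaves(tree[1])


def clades(tree):
    """All internal nodes of the tree, each represented by its leaf set."""
    if isinstance(tree, str):
        return set()
    return {leaves(tree)} | clades(tree[0]) | clades(tree[1])


# Hypothetical toy "models": a base model and two fine-tuning lineages.
toy_weights = {
    "base": [0.0, 0.0, 0.0],
    "ft_a": [1.0, 0.0, 0.0],
    "ft_a2": [1.1, 0.1, 0.0],
    "ft_b": [0.0, 2.0, 0.0],
    "ft_b2": [0.0, 2.1, 0.1],
}

tree = upgma(toy_weights)
print(tree)
```

Comparing the clades of an estimated tree against those of a known training tree is one simple way to check whether the topology was recovered. Note that UPGMA yields a tree in which every model is a leaf, so an ancestor such as `base` appears as a sibling of its descendants rather than as an internal node.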