Pre-trained large language models (LLMs) are highly capable of generating creative natural-language text. Evolutionary algorithms (EAs) can discover diverse solutions to complex real-world problems. Motivated by the collectivity and directionality shared by text sequence generation and evolution, this paper illustrates the strong consistency between LLMs and EAs, which comprises several one-to-one correspondences between key characteristics: token embedding and genotype-phenotype mapping, position encoding and fitness shaping, position embedding and selection, attention and crossover, feed-forward neural network and mutation, model training and parameter update, and multi-task learning and multi-objective optimization. From this consistency perspective, we analyze existing coupling studies, including evolutionary fine-tuning and LLM-enhanced EAs. Leveraging these insights, we outline a fundamental roadmap for future research on coupling LLMs and EAs and highlight the key challenges along the way. This consistency not only reveals the evolutionary mechanism behind LLMs but also facilitates the development of evolved artificial agents that approach or surpass biological organisms.