Digital Red Queen: Adversarial Program Evolution in Core War with LLMs

Large language models (LLMs) are increasingly being used to evolve solutions to problems in many domains, in a process inspired by biological evolution. However, unlike biological evolution, most LLM-evolution frameworks are formulated as static optimization problems, overlooking the open-ended adversarial dynamics that characterize real-world evolutionary processes. Here, we study Digital Red Queen (DRQ), a simple self-play algorithm that embraces these so-called "Red Queen" dynamics via continual adaptation to a changing objective. DRQ uses an LLM to evolve assembly-like programs, called warriors, which compete against each other for control of a virtual machine in the game of Core War, a Turing-complete environment studied in artificial life and connected to cybersecurity. In each round of DRQ, the model evolves a new warrior to defeat all previous ones, producing a sequence of adapted warriors. Over many rounds, we observe that warriors become increasingly general (relative to a set of held-out human warriors). Interestingly, warriors also become less behaviorally diverse across independent runs, indicating a convergence pressure toward a general-purpose behavioral strategy, much like convergent evolution in nature. This result highlights a potential value of shifting from static objectives to dynamic Red Queen objectives. Our work positions Core War as a rich, controllable sandbox for studying adversarial adaptation in artificial systems and for evaluating LLM-based evolution methods. More broadly, the simplicity and effectiveness of DRQ suggest that similarly minimal self-play approaches could prove useful in other more practical multi-agent adversarial domains, like real-world cybersecurity or combating drug resistance.
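The round structure described above (each round evolves a new warrior against all previous ones) can be sketched as a short loop. This is a minimal illustration under stated assumptions, not the authors' implementation: `toy_battle` is a hypothetical rock-paper-scissors scoring rule standing in for a Core War simulator, `toy_propose` is a random mutation standing in for the LLM, warriors are lists of integers rather than Redcode programs, and a simple hill climb stands in for whatever search the authors run within a round.

```python
import random
from typing import Callable, List

Warrior = List[int]  # hypothetical stand-in for a Redcode program
Battle = Callable[[Warrior, Warrior], float]
Propose = Callable[[Warrior, random.Random], Warrior]


def toy_battle(a: Warrior, b: Warrior) -> float:
    """Toy match (not Core War): 1.0 if a wins, 0.5 for a draw, 0.0 if a loses."""
    # Each position is a rock-paper-scissors duel: value v beats value (v - 1) % 3.
    margin = sum(
        int((x - y) % 3 == 1) - int((y - x) % 3 == 1) for x, y in zip(a, b)
    )
    if margin > 0:
        return 1.0
    return 0.5 if margin == 0 else 0.0


def toy_propose(parent: Warrior, rng: random.Random) -> Warrior:
    """Random point mutation; an assumption standing in for the LLM edit step."""
    child = list(parent)
    child[rng.randrange(len(child))] = rng.randrange(3)
    return child


def fitness(warrior: Warrior, opponents: List[Warrior], battle: Battle) -> float:
    """Mean score of `warrior` against every opponent in the list."""
    return sum(battle(warrior, o) for o in opponents) / len(opponents)


def digital_red_queen(
    initial: Warrior,
    rounds: int,
    steps_per_round: int,
    propose: Propose,
    battle: Battle,
    rng: random.Random,
) -> List[Warrior]:
    """Return the sequence of warriors, one new warrior per round."""
    history = [list(initial)]
    for _ in range(rounds):
        # The objective changes every round: it is the full history so far.
        champion = list(history[-1])
        best = fitness(champion, history, battle)
        for _ in range(steps_per_round):
            candidate = propose(champion, rng)
            score = fitness(candidate, history, battle)
            if score >= best:  # hill climb: keep candidates that are no worse
                champion, best = candidate, score
        history.append(champion)
    return history


if __name__ == "__main__":
    lineage = digital_red_queen(
        [0] * 6, rounds=5, steps_per_round=30,
        propose=toy_propose, battle=toy_battle, rng=random.Random(0),
    )
    for round_index, warrior in enumerate(lineage):
        print(round_index, warrior)
```

The point of the sketch is the moving target: `fitness` is always measured against the growing `history`, so a warrior that was optimal in one round can be beaten in the next. Generality in the sense of the abstract would be measured separately, by scoring each warrior in the returned list against a held-out set that never enters `history`.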
