Agents based on Large Language Models (LLMs) are increasingly permeating many domains of human production and life, which makes aligning them with human values increasingly important. Current AI alignment work focuses primarily on passively aligning LLMs through human intervention. However, agents receive environmental feedback and can evolve on their own, which renders existing LLM alignment methods inadequate. In response, we propose EvolutionaryAgent, an evolutionary framework for agent evolution and alignment that transforms agent alignment into a process of evolution and selection under the principle of survival of the fittest. In an environment where social norms continuously evolve, agents better adapted to the current social norms have a higher probability of survival and proliferation, while inadequately aligned agents dwindle over time. Experiments that assess the agents' alignment with social norms from multiple perspectives show that EvolutionaryAgent aligns progressively better with evolving social norms while maintaining its proficiency in general tasks. Effectiveness tests with various open- and closed-source LLMs as the agents' foundation models also demonstrate the applicability of our approach.
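The selection dynamic described above can be illustrated with a toy sketch. It is not the EvolutionaryAgent implementation: here each agent is reduced to a hypothetical numeric behaviour vector, the social norm is a target vector that may drift by a fixed amount per generation, and fitness is assumed to be the negative L1 distance between the two. The names `fitness`, `select`, `reproduce`, and `evolve`, and the parameters `k`, `drift`, and `sigma`, are all assumptions made for illustration.

```python
import random


def fitness(agent, norm):
    """Assumed fitness: negative L1 distance between behaviour and norm (higher is better)."""
    return -sum(abs(a - n) for a, n in zip(agent, norm))


def select(population, norm, k):
    """Keep the k agents best adapted to the current norm."""
    return sorted(population, key=lambda agent: fitness(agent, norm), reverse=True)[:k]


def reproduce(parent, rng, sigma):
    """Create an offspring as a Gaussian-perturbed copy of the parent."""
    return [x + rng.gauss(0.0, sigma) for x in parent]


def evolve(population, norm, generations, k, rng, drift=0.0, sigma=0.1):
    """Alternate selection and reproduction while the norm drifts each generation."""
    size = len(population)
    for _ in range(generations):
        survivors = select(population, norm, k)
        # Refill the population with mutated copies of randomly chosen survivors.
        offspring = [reproduce(rng.choice(survivors), rng, sigma) for _ in range(size - k)]
        population = survivors + offspring
        # The norm itself moves, so adaptation has to be continual.
        norm = [n + drift for n in norm]
    return population, norm


if __name__ == "__main__":
    rng = random.Random(0)
    agents = [[rng.uniform(-5, 5) for _ in range(3)] for _ in range(20)]
    final_agents, final_norm = evolve(agents, [1.0, 2.0, 3.0], 50, 5, rng, drift=0.02)
    print(max(fitness(a, final_norm) for a in final_agents))
```

Because survivors are carried over unchanged, the best fitness never decreases while the norm is fixed (`drift=0.0`); with a drifting norm, the population has to keep adapting to stay aligned, which is the situation the abstract describes.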