We introduce PokeLLMon, the first LLM-embodied agent that achieves human-parity performance in tactical battle games, as demonstrated in Pokemon battles. The design of PokeLLMon incorporates three key strategies: (i) in-context reinforcement learning, which instantly consumes text-based feedback derived from battles to iteratively refine the policy; (ii) knowledge-augmented generation, which retrieves external knowledge to counteract hallucination and enables the agent to act in a timely and appropriate manner; (iii) consistent action generation, which mitigates the panic switching phenomenon that occurs when the agent faces a powerful opponent and tries to evade the battle. Online battles against human players demonstrate PokeLLMon's human-like battle strategies and just-in-time decision making: it achieves a 49% win rate in Ladder competitions and a 56% win rate in invited battles. Our implementation and playable battle logs are available at: \url{https://github.com/git-disl/PokeLLMon}.
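As an illustration of strategy (iii), the following minimal sketch shows one plausible way to make action generation more consistent: sample several candidate actions and keep the most frequent one. The abstract does not specify the mechanism, so the voting scheme, the function names `consistent_action` and `make_fake_sampler`, and the action strings are all assumptions made for illustration, not PokeLLMon's implementation.

```python
from collections import Counter
from typing import Callable, List


def consistent_action(sample_action: Callable[[], str], n_samples: int = 3) -> str:
    """Sample several candidate actions and return the most frequent one.

    `sample_action` is a hypothetical stand-in for one LLM call that
    returns an action string for the current battle state.
    """
    votes = Counter(sample_action() for _ in range(n_samples))
    # most_common(1) returns [(action, count)]; on a tie, the action
    # that was sampled first wins.
    action, _ = votes.most_common(1)[0]
    return action


def make_fake_sampler(outputs: List[str]) -> Callable[[], str]:
    """Hypothetical sampler that replays a fixed list of candidate actions."""
    it = iter(outputs)
    return lambda: next(it)


# One sample "panics" and switches; the other two agree on attacking.
sampler = make_fake_sampler(
    ["switch gyarados", "move thunderbolt", "move thunderbolt"]
)
chosen = consistent_action(sampler, n_samples=3)
print(chosen)
```

In this toy run a single outlier switch is outvoted, which is the kind of isolated panic decision the strategy is meant to suppress. A real agent would replace the fake sampler with repeated model queries on the same battle state.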