We introduce \textsc{Pok\'eLLMon}, the first LLM-embodied agent that achieves human-parity performance in tactical battle games, as demonstrated in Pok\'emon battles. The design of \textsc{Pok\'eLLMon} incorporates three key strategies: (i) in-context reinforcement learning, which instantly consumes text-based feedback derived from battles to iteratively refine the policy; (ii) knowledge-augmented generation, which retrieves external knowledge to counteract hallucination and enables the agent to act in a timely and appropriate manner; and (iii) consistent action generation, which mitigates the \textit{panic switching} phenomenon that arises when the agent faces a powerful opponent and tries to evade the battle. Online battles against human players demonstrate \textsc{Pok\'eLLMon}'s human-like battle strategies and just-in-time decision making: it achieves a 49\% win rate in Ladder competitions and a 56\% win rate in invited battles. Our implementation and playable battle logs are available at \url{https://github.com/git-disl/PokeLLMon}.