This paper introduces a new paradigm for AI game programming, leveraging large language models (LLMs) to extend and operationalize Claude Shannon's taxonomy of game-playing machines. Central to this paradigm is Nemobot, an interactive agentic engineering environment that enables users to create, customize, and deploy LLM-powered game agents while actively engaging with AI-driven strategies. The LLM-based chatbot, integrated within Nemobot, demonstrates its capabilities across four distinct classes of games. For dictionary-based games, it compresses state-action mappings into efficient, generalized models for rapid adaptability. In rigorously solvable games, it employs mathematical reasoning to compute optimal strategies and generates human-readable explanations for its decisions. For heuristic-based games, it synthesizes strategies by combining insights from classical minimax algorithms (see, e.g., shannon1950chess) with crowdsourced data. Finally, in learning-based games, it utilizes reinforcement learning with human feedback and self-critique to iteratively refine strategies through trial and error and imitation learning. Nemobot amplifies this framework by offering a programmable environment where users can experiment with tool-augmented generation and fine-tuning of strategic game agents. From strategic games to role-playing games, Nemobot demonstrates how AI agents can achieve a form of self-programming by integrating crowdsourced learning and human creativity to iteratively refine their own logic. This represents a step toward the long-term goal of self-programming AI.
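To make the "rigorously solvable" class concrete, the following is a minimal sketch, assuming Nim under normal play (the player who takes the last object wins) as the example game. The abstract does not name a specific game, and the function names `nim_sum` and `best_nim_move` are hypothetical, chosen for illustration rather than taken from Nemobot. The sketch computes an optimal move from the nim-sum and returns a short human-readable explanation alongside it.

```python
from functools import reduce
from operator import xor


def nim_sum(piles):
    """XOR of all pile sizes. Zero means the player to move loses under optimal play."""
    return reduce(xor, piles, 0)


def best_nim_move(piles):
    """Return (pile_index, new_size, explanation) for normal-play Nim.

    Illustrative assumption: `piles` is a list of non-negative integers.
    """
    if not any(piles):
        raise ValueError("no moves left: all piles are empty")

    total = nim_sum(piles)
    if total == 0:
        # No winning move exists; fall back to taking one object from a largest pile.
        idx = max(range(len(piles)), key=lambda i: piles[i])
        return idx, piles[idx] - 1, (
            "Nim-sum is 0, so every move leaves a winning position for the "
            "opponent; taking one object from the largest pile as a fallback."
        )

    for idx, size in enumerate(piles):
        target = size ^ total
        if target < size:
            # Shrinking this pile to `target` makes the overall nim-sum zero.
            return idx, target, (
                f"Nim-sum is {total}; reducing pile {idx} from {size} to "
                f"{target} leaves a nim-sum of 0 for the opponent."
            )

    # A non-zero nim-sum always admits such a move, so this is never reached.
    raise AssertionError("unreachable")


if __name__ == "__main__":
    index, new_size, why = best_nim_move([3, 4, 5])
    print(index, new_size)
    print(why)
```

For piles `[3, 4, 5]` the nim-sum is 2, so the sketch reduces pile 0 from 3 to 1. Pairing each move with a generated sentence mirrors the abstract's point that solvable games allow both an optimal strategy and an explanation of it.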