Designing a board game requires thinking like a designer and experiencing the game as a player while iterating through repeated cycles of prototyping and playtesting, which makes it a cognitively intensive creative task well suited to human-AI collaboration. However, current systems lack end-to-end support for guiding designers through the complete workflow, from vague early ideation to iterative rulebook revision and audience testing. To this end, we present AutoBG, a board game design assistant built around critic-driven iterative refinement. It comprises four specialized modules: BG-Ideator guides designers through multi-turn dialogue to produce structured design drafts; BG-Realizer generates complete rulebooks from those drafts and revises them in a closed loop with BG-Critic, which diagnoses design flaws and gates each revision so that only verified improvements are accepted; and BG-Persona simulates individualized feedback from 150 real player profiles. Together, these modules let designers go from an initial idea to a polished, audience-tested rulebook within a single integrated workflow. The system is built on 2.2K structured rulebooks and 180K quality-filtered real player reviews, with task-specific training data derived for each module. Experiments on 207 held-out games show that AutoBG substantially outperforms state-of-the-art baselines (e.g., GPT-5.4) and generates rulebooks that approach the quality of published games. Furthermore, a user study with 30 participants across diverse experience levels confirms that AutoBG reduces blank-page anxiety, surfaces hidden design flaws, and provides highly rated, practical assistance throughout the creative process.
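The closed loop between BG-Realizer and BG-Critic can be illustrated with a minimal sketch. The abstract does not specify the actual gating criterion, so everything here is an assumption made for illustration: the `Critique` type, the `refine` function, the scalar score, and the toy critic and reviser are hypothetical stand-ins, not AutoBG's implementation. The sketch only shows the general pattern of accepting a revision when the critic verifies that it improves on the current rulebook.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Critique:
    score: float      # higher is better; a scalar score is an assumption
    flaws: List[str]  # diagnosed design flaws


def refine(rulebook: str,
           critic: Callable[[str], Critique],
           reviser: Callable[[str, List[str]], str],
           max_rounds: int = 5) -> Tuple[str, float, int]:
    """Critic-gated refinement: keep a revision only if the critic scores it higher."""
    best, best_crit = rulebook, critic(rulebook)
    accepted = 0
    for _ in range(max_rounds):
        if not best_crit.flaws:          # nothing left to fix
            break
        candidate = reviser(best, best_crit.flaws)
        cand_crit = critic(candidate)
        if cand_crit.score > best_crit.score:  # the gate: verified improvement only
            best, best_crit = candidate, cand_crit
            accepted += 1
    return best, best_crit.score, accepted


# Hypothetical stand-ins for the critic and reviser models.
REQUIRED = ("Setup", "Turn Order", "Victory")


def toy_critic(text: str) -> Critique:
    # Scores a rulebook by how many required sections it mentions.
    missing = [s for s in REQUIRED if s not in text]
    return Critique(score=len(REQUIRED) - len(missing),
                    flaws=[f"missing section: {s}" for s in missing])


def toy_reviser(text: str, flaws: List[str]) -> str:
    # Addresses the first diagnosed flaw by appending a stub section.
    section = flaws[0].split(": ", 1)[1]
    return text + f"\n## {section}\n(to be written)"


draft = "## Setup\nDeal five cards to each player."
final, score, accepted = refine(draft, toy_critic, toy_reviser)
print(accepted, score)
```

In this toy run the two missing sections are added in two accepted rounds. A reviser that made the rulebook worse would have every candidate rejected, leaving the draft unchanged.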