Text-based games (TGs) are language-based interactive environments for reinforcement learning. While language models (LMs) and knowledge graphs (KGs) are commonly used to handle the large action spaces in TGs, it is unclear whether these techniques are necessary or overused. In this paper, we revisit the challenge of exploring the action space in TGs and propose $\epsilon$-admissible exploration, a minimal approach that uses admissible actions during the training phase. Additionally, we present a text-based actor-critic (TAC) agent that produces textual commands for the game solely from game observations, without requiring any KG or LM. Averaged across 10 games from Jericho, our method outperforms strong baselines and state-of-the-art agents that use LMs and KGs. Our approach highlights that a much lighter model design, with a fresh perspective on utilizing the information within the environments, suffices for effective exploration of exponentially large action spaces.
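The abstract does not spell out the mechanics of $\epsilon$-admissible exploration, so the following is only a minimal sketch of one plausible reading, by analogy with $\epsilon$-greedy: during training, with probability $\epsilon$ the agent takes an action sampled uniformly from the admissible actions reported by the environment, and otherwise it takes the command produced by its own policy. The function name `epsilon_admissible_action`, its parameters, and the uniform sampling are assumptions made for illustration, not details confirmed by the text.

```python
import random
from typing import Callable, Sequence


def epsilon_admissible_action(
    policy_action: Callable[[], str],
    admissible_actions: Sequence[str],
    epsilon: float,
    rng: random.Random,
) -> str:
    """Pick a training-time action (hypothetical helper, not the paper's code).

    With probability `epsilon`, sample uniformly from `admissible_actions`
    (assumed to be supplied by the environment). Otherwise, fall back to the
    agent's own generated command via `policy_action`.
    """
    # Explore only when the environment actually offers admissible actions.
    if admissible_actions and rng.random() < epsilon:
        return rng.choice(list(admissible_actions))
    # Exploit: use the command generated by the agent's policy.
    return policy_action()


if __name__ == "__main__":
    rng = random.Random(0)
    admissible = ["go north", "take lamp", "open mailbox"]
    # A stand-in policy that always proposes the same command.
    action = epsilon_admissible_action(lambda: "look", admissible, 0.3, rng)
    print(action)
```

Under this reading, the admissible-action set is needed only while collecting training experience; at evaluation time the agent would be run with `epsilon` set to 0 so that every command comes from the policy.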