Developing a reinforcement learning (RL) agent often involves identifying values for numerous parameters, covering the policy, the reward function, the environment, and the agent's internal architecture. Since these parameters are interrelated in complex ways, optimizing them is a black-box problem that proves especially challenging for non-experts. Although existing optimization-as-a-service platforms (e.g., Vizier and Optuna) can handle such problems, they are impractical for RL systems: the user must manually map each parameter to a distinct component, which makes the effort cumbersome, and must also understand the optimization process itself. This limits the platforms' application beyond machine learning and restricts access in areas such as cognitive science, which models human decision-making. To tackle these challenges, the paper presents AgentForge, a flexible low-code platform for optimizing any parameter set across an RL system. Available at https://github.com/feferna/AgentForge, it allows an optimization problem to be defined in a few lines of code and handed to any of the interfaced optimizers. With AgentForge, the user can optimize the parameters either individually or jointly. The paper presents an evaluation of its performance on a challenging vision-based RL problem.
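For context, the manual per-parameter mapping described above looks roughly like the following when done directly in Optuna: every RL parameter, whether it belongs to the policy optimizer, the reward function, or the agent's architecture, must be individually wired into an objective function by the user. This is only a minimal sketch; the parameter names and the stub training function are illustrative assumptions, not part of AgentForge's API.

```python
import optuna


def train_and_evaluate(lr: float, reward_weight: float, hidden_units: int) -> float:
    # Stub standing in for an actual RL training run, so the sketch stays
    # runnable. In practice this would train the agent with the sampled
    # parameters and return, e.g., its mean episodic return.
    return -((lr - 3e-4) ** 2) + reward_weight + hidden_units / 1000.0


def objective(trial: optuna.Trial) -> float:
    # Each RL parameter must be mapped to the optimizer by hand:
    lr = trial.suggest_float("learning_rate", 1e-5, 1e-2, log=True)           # policy optimizer
    reward_weight = trial.suggest_float("reward_weight", 0.0, 1.0)            # reward function
    hidden_units = trial.suggest_categorical("hidden_units", [64, 128, 256])  # agent architecture
    return train_and_evaluate(lr, reward_weight, hidden_units)


study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=50)
print(study.best_params)
```

AgentForge's stated aim is to replace this per-parameter wiring with a declarative, few-line problem definition that can be handed to any of the interfaced optimizers, either for individual parameters or for the whole set jointly.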