Deep reinforcement learning (DRL) techniques are increasingly used for decision-making in a variety of fields. However, a recurring challenge is the trade-off between the computational efficiency of the decision-making process and the ability of the learned agent to solve a particular task. This trade-off is particularly critical in real-time settings such as video games, where the agent must make relevant decisions at a very high frequency and with very limited inference time. In this work, we propose a generic offline learning approach that takes the computational cost of the input features into account. We derive the Budgeted Decision Transformer, an extension of the Decision Transformer that incorporates cost constraints to limit its cost at inference. As a result, the model can dynamically choose the best input features at each timestep. We demonstrate the effectiveness of our method on several tasks, including D4RL benchmarks and complex 3D environments similar to those found in video games, and show that it achieves performance similar to classical approaches while using significantly fewer computational resources.