This study presents a benchmark for evaluating action-constrained reinforcement learning (RL) algorithms. In action-constrained RL, every action taken by the learning system must satisfy given constraints, which are crucial for ensuring that actions are feasible and safe in real-world systems. We evaluate existing algorithms and novel variants of them across multiple robotics control environments and multiple types of action constraints. Our evaluation provides the first in-depth perspective on the field and reveals surprising insights, including the effectiveness of a straightforward baseline approach. The benchmark problems and the code used in our experiments are available online at github.com/omron-sinicx/action-constrained-RL-benchmark for further research and development.
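To make the notion of an action constraint concrete, the following is a minimal sketch, not taken from the benchmark code. It assumes a hypothetical constraint that bounds the Euclidean norm of the action and enforces it by projecting the policy's raw output onto the feasible set. The function name `project_to_l2_ball` and the parameter `max_norm` are illustrative assumptions; the constraint types and algorithms actually evaluated are those defined in the repository.

```python
import numpy as np


def project_to_l2_ball(action, max_norm):
    """Return the Euclidean-closest action whose L2 norm is at most max_norm.

    Hypothetical example constraint: ||a||_2 <= max_norm.
    """
    action = np.asarray(action, dtype=float)
    norm = np.linalg.norm(action)
    if norm <= max_norm:
        # Already feasible: leave the action unchanged.
        return action
    # Infeasible: rescale onto the boundary of the ball.
    return action * (max_norm / norm)


# A raw policy output that violates the (assumed) norm bound of 1.0.
raw_action = np.array([3.0, 4.0])
safe_action = project_to_l2_ball(raw_action, max_norm=1.0)
```

For a norm ball the projection has this closed form; for general constraint sets, computing the projection typically requires solving an optimization problem at every step.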