Deep reinforcement learning has broadened what is possible in robot control across many domains. To address safety and sample-efficiency concerns, deep reinforcement learning models can be trained in simulation, which also allows faster iteration cycles. Parallelizing the training process on GPUs speeds this up further. NVIDIA's open-source robot learning framework Orbit exploits this potential by wrapping tensor-based reinforcement learning libraries for high parallelism and by building on Isaac Sim for its simulations. We contribute a detailed description of how to implement a benchmark reinforcement learning task, box pushing, in Orbit. We also benchmark our implementation against a CPU-based implementation and report the resulting performance metrics. Finally, we tune the hyperparameters of our implementation and show that Orbit lets us generate significantly more samples in the same amount of time.
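To illustrate why batching environments raises sample throughput, the following is a minimal sketch of a batched rollout loop. It is not Orbit's API and does not use Isaac Sim: `BatchedPushEnv` and `collect` are hypothetical names, the dynamics are a toy one-dimensional push task, and NumPy arrays on the CPU stand in for the GPU tensors a tensor-based framework would use. The point is only that one `step` call advances every environment at once, so the number of transitions per call grows with the number of environments.

```python
import numpy as np


class BatchedPushEnv:
    """Toy 1-D push task stepped for many environments at once (hypothetical)."""

    def __init__(self, num_envs, dt=0.05, seed=0):
        self.num_envs = num_envs
        self.dt = dt
        self.rng = np.random.default_rng(seed)
        self.reset()

    def reset(self):
        # One entry per environment: pusher starts at 0, box and target are randomized.
        self.pusher = np.zeros(self.num_envs)
        self.box = self.rng.uniform(0.2, 0.5, self.num_envs)
        self.target = self.rng.uniform(0.6, 1.0, self.num_envs)
        return self._obs()

    def _obs(self):
        # Observation matrix of shape (num_envs, 3).
        return np.stack([self.pusher, self.box, self.target], axis=1)

    def step(self, actions):
        # actions has shape (num_envs,): pusher velocity, clipped to [-1, 1].
        self.pusher = self.pusher + np.clip(actions, -1.0, 1.0) * self.dt
        # The box is carried along whenever the pusher would move past it.
        self.box = np.maximum(self.box, self.pusher)
        # Reward is the negative distance between box and target.
        reward = -np.abs(self.target - self.box)
        return self._obs(), reward


def collect(env, num_steps):
    """Roll out a fixed 'push forward' policy and count the transitions gathered."""
    obs = env.reset()
    reward = np.zeros(env.num_envs)
    samples = 0
    for _ in range(num_steps):
        obs, reward = env.step(np.ones(env.num_envs))
        samples += reward.shape[0]  # one transition per environment per step
    return samples, obs, reward
```

With 1024 environments and 10 steps, `collect(BatchedPushEnv(1024), 10)` gathers 10,240 transitions from only 10 `step` calls, whereas a single-environment loop would need 10,240 calls for the same amount of data.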