Diffusion models have proven highly competitive on offline reinforcement learning tasks by formulating decision-making as sequential generation. However, their practicality is limited by the lengthy inference process they require. In this paper, we address this problem by decomposing the sampling process of diffusion models into two decoupled subprocesses: 1) generating a feasible trajectory, which is the time-consuming part, and 2) optimizing that trajectory. This decomposition partially disentangles the efficiency and quality factors, letting us gain inference speed without giving up quality. We propose the Trajectory Diffuser, which uses a faster autoregressive model to generate feasible trajectories while retaining the trajectory optimization process of diffusion models, thereby achieving more efficient planning without sacrificing capability. To evaluate the effectiveness and efficiency of the Trajectory Diffuser, we conduct experiments on the D4RL benchmarks. The results demonstrate that our method achieves $3$-$10\times$ faster inference than previous sequence modeling methods while also outperforming them in overall performance. Code is available at https://github.com/RenMing-Huang/TrajectoryDiffuser.

Keywords: Reinforcement Learning · Efficient Planning · Diffusion Model
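To make the decomposition concrete, below is a minimal sketch of the two-stage sampling it describes, not the authors' implementation: a fast autoregressive model proposes a feasible trajectory in a single pass, and a diffusion model then refines the proposal with only a few reverse denoising steps instead of running the full chain from pure noise. All names (`ARProposer`, `Denoiser`), the GRU/MLP stand-in architectures, the dimensions, and the choice of `k = 10` refinement steps out of `T = 100` are illustrative assumptions; see the linked repository for the actual method.

```python
# Sketch of two-stage planning: AR proposal + short diffusion refinement.
# Hypothetical stand-ins only; not the Trajectory Diffuser implementation.
import torch
import torch.nn as nn

OBS_DIM, ACT_DIM, HORIZON = 17, 6, 32      # assumed sizes for illustration
TRANS_DIM = OBS_DIM + ACT_DIM              # one (state, action) transition

class ARProposer(nn.Module):
    """Hypothetical autoregressive trajectory generator (stand-in: a GRU)."""
    def __init__(self):
        super().__init__()
        self.rnn = nn.GRU(TRANS_DIM, 128, batch_first=True)
        self.head = nn.Linear(128, TRANS_DIM)

    def forward(self, first):                        # first: (B, TRANS_DIM)
        steps, x, h = [first], first.unsqueeze(1), None
        for _ in range(HORIZON - 1):
            out, h = self.rnn(x, h)
            x = self.head(out)                       # predict next transition
            steps.append(x.squeeze(1))
        return torch.stack(steps, dim=1)             # (B, HORIZON, TRANS_DIM)

class Denoiser(nn.Module):
    """Hypothetical noise-prediction network eps_theta (stand-in: an MLP)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(HORIZON * TRANS_DIM + 1, 256), nn.ReLU(),
            nn.Linear(256, HORIZON * TRANS_DIM))

    def forward(self, x, t):                         # x: (B, HORIZON, TRANS_DIM)
        b = x.shape[0]
        t_emb = torch.full((b, 1), float(t))         # scalar timestep embedding
        return self.net(torch.cat([x.reshape(b, -1), t_emb], -1)).reshape_as(x)

T = 100                                              # full diffusion length
betas = torch.linspace(1e-4, 2e-2, T)                # assumed linear schedule
alphas = 1.0 - betas
alpha_bars = torch.cumprod(alphas, dim=0)

@torch.no_grad()
def plan(first_transition, proposer, denoiser, k=10):
    """Stage 1: fast AR proposal. Stage 2: only k << T denoising steps."""
    traj = proposer(first_transition)
    # Forward-noise the proposal to level k-1, then denoise it back down,
    # so the reverse process is warm-started instead of run from pure noise.
    x = (alpha_bars[k - 1].sqrt() * traj
         + (1 - alpha_bars[k - 1]).sqrt() * torch.randn_like(traj))
    for t in reversed(range(k)):                     # DDPM-style reverse steps
        eps = denoiser(x, t)
        x = (x - betas[t] / (1 - alpha_bars[t]).sqrt() * eps) / alphas[t].sqrt()
        if t > 0:
            x = x + betas[t].sqrt() * torch.randn_like(x)
    return x                                         # refined trajectory plan

trajectory = plan(torch.randn(1, TRANS_DIM), ARProposer(), Denoiser())
```

Under these assumptions, the expensive part of generation is amortized into one autoregressive pass, while the few remaining reverse steps preserve the diffusion model's role as a trajectory optimizer; this mirrors, at sketch level, how the abstract separates the efficiency and quality factors.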