Recent studies demonstrate that diffusion planners benefit from sparse-step planning over single-step planning: training models to skip steps in their generated trajectories helps capture long-horizon dependencies without additional memory or computational cost. However, excessively sparse plans degrade performance. We hypothesize that this temporal density threshold is non-uniform across the planning horizon and that certain parts of a predicted trajectory should be generated more densely. We propose the Mixed-Density Diffuser (MDD), a diffusion planner whose densities across the horizon are tunable hyperparameters. We show that MDD surpasses the SOTA Diffusion Veteran (DV) framework across the Maze2D, Franka Kitchen, and AntMaze task domains of the Datasets for Deep Data-Driven Reinforcement Learning (D4RL) benchmark, achieving a new SOTA on D4RL.
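To make the idea of horizon-dependent density concrete, here is a minimal sketch of how a mixed-density timestep schedule could be built. The abstract does not specify MDD's mechanism, so the function name, the `(fraction, stride)` parameterization, and the dense-early/sparse-late choice in the example are all assumptions for illustration only.

```python
# Hypothetical sketch: a mixed-density planning schedule. The per-segment
# strides play the role of the tunable density hyperparameters described
# in the abstract; this is NOT the paper's actual implementation.
def mixed_density_indices(horizon, segments):
    """Build trajectory timestep indices from (fraction, stride) segments.

    horizon:  total number of environment steps covered by the plan.
    segments: list of (fraction_of_horizon, stride) pairs; a smaller
              stride means that region of the plan is generated more densely.
    """
    indices, t = [], 0
    for frac, stride in segments:
        end = min(horizon, t + max(1, round(frac * horizon)))
        indices.extend(range(t, end, stride))
        t = end
    if indices[-1] != horizon - 1:
        indices.append(horizon - 1)  # always keep the final state in the plan
    return indices

# Example: dense first quarter (every step), sparse remainder (every 8 steps).
plan_steps = mixed_density_indices(64, [(0.25, 1), (0.75, 8)])
```

A diffusion planner would then denoise only the states at `plan_steps` rather than all `horizon` steps, spending model capacity where the schedule is dense.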