Differentiable planning promises end-to-end differentiability and adaptivity. However, one issue prevents it from scaling to larger problems: existing methods differentiate through the forward iteration layers to compute gradients, which couples forward computation with backpropagation and forces a trade-off between forward planner performance and the computational cost of the backward pass. To alleviate this issue, we propose differentiating through the Bellman fixed-point equation, which decouples the forward and backward passes for the Value Iteration Network (VIN) and its variants. This yields a backward cost that is constant in the planning horizon, allows a flexible forward budget, and helps scale to large tasks. We study the convergence stability, scalability, and efficiency of the proposed implicit versions of VIN and its variants and demonstrate their advantages on a range of planning tasks: 2D navigation, visual navigation, and 2-DOF manipulation in configuration space and workspace.
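To make the idea of differentiating through the Bellman fixed point concrete, the following is a minimal sketch on a small tabular MDP, not the convolutional VIN architecture and not the authors' implementation. It assumes a reward table `R[s, a]`, a transition tensor `P[a, s, t]`, and a discount `gamma`; the function names `q_values`, `value_iteration`, and `implicit_grad_R` are hypothetical. At the fixed point V* = max_a (R + gamma P V*), the greedy policy pi gives the Jacobian gamma P_pi, so the gradient of a loss with respect to R follows from a single linear solve with (I - gamma P_pi)^T, independent of how many forward iterations were run.

```python
import numpy as np

def q_values(R, P, gamma, V):
    # Q[s, a] = R[s, a] + gamma * sum_t P[a, s, t] * V[t]
    return R + gamma * np.einsum("ast,t->sa", P, V)

def value_iteration(R, P, gamma, iters=500):
    # Forward pass: plain value iteration; no computation graph is stored.
    V = np.zeros(R.shape[0])
    for _ in range(iters):
        V = q_values(R, P, gamma, V).max(axis=1)
    return V

def implicit_grad_R(R, P, gamma, V_star, grad_V):
    # Backward pass via the implicit function theorem at the fixed point.
    # grad_V is dL/dV*; the result is dL/dR with the same shape as R.
    S = R.shape[0]
    pi = q_values(R, P, gamma, V_star).argmax(axis=1)  # greedy policy at V*
    P_pi = P[pi, np.arange(S)]                         # row s is P[pi[s], s, :]
    # Solve (I - gamma * P_pi)^T u = dL/dV*: one linear solve, whatever the
    # number of forward iterations was.
    u = np.linalg.solve((np.eye(S) - gamma * P_pi).T, grad_V)
    grad_R = np.zeros_like(R)
    grad_R[np.arange(S), pi] = u  # only the greedy action's reward affects V*
    return grad_R

# Illustrative usage on a random MDP (all values below are assumptions).
rng = np.random.default_rng(0)
S, A, gamma = 4, 3, 0.9
R = rng.normal(size=(S, A))
P = rng.dirichlet(np.ones(S), size=(A, S))  # shape (A, S, S), rows sum to 1
w = rng.normal(size=S)                      # loss L = w . V*, so dL/dV* = w

grad_short = implicit_grad_R(R, P, gamma, value_iteration(R, P, gamma, 200), w)
grad_long = implicit_grad_R(R, P, gamma, value_iteration(R, P, gamma, 2000), w)
same_gradient = np.allclose(grad_short, grad_long)
```

In this sketch the forward budget (200 versus 2000 iterations) changes only how close `V` is to the fixed point; the backward computation is the same linear solve in both cases, which is the decoupling the abstract describes. The hard `max` makes the operator piecewise linear, so the formula assumes the greedy action is unique at V*.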