In this paper, we address two challenges faced by Value Iteration Networks (VIN): handling larger input maps, and mitigating the errors that accumulate as the number of iterations grows. We propose Value Iteration Networks with Gated Summarization Module (GS-VIN), which incorporates two main improvements: (1) an Adaptive Iteration Strategy in the Value Iteration (VI) module that reduces the number of iterations, and (2) a Gated Summarization module that summarizes the iterative process. The adaptive iteration strategy uses larger convolution kernels with fewer iterations, reducing network depth and increasing training stability while maintaining the accuracy of the planning process. The gated summarization module temporally and spatially resamples the entire planning process within the VI module, so that the network draws on the whole planning process rather than relying solely on the final global planning outcome. We conduct experiments on 2D grid world path-finding problems and the Atari Mr. Pac-man environment, demonstrating that GS-VIN outperforms the baseline in single-step accuracy, planning success rate, and overall performance across different map sizes. Additionally, we analyze the relationship between input size, kernel size, and the number of iterations in VI-based models; this analysis applies to most VI-based models and offers useful guidance for both researchers and industrial deployment.
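The trade-off between input size, kernel size, and iteration count can be illustrated with a standard receptive-field argument. The sketch below is not the analysis from the paper itself; it assumes a stride-1 square kernel of odd size k, so that each VI iteration can move value information at most (k - 1) / 2 cells, and it assumes the worst case where the goal sits in one corner of an obstacle-free m x m map. The function names `min_iterations` and `iterations_to_cover` are hypothetical and introduced only for illustration.

```python
import math

import numpy as np


def min_iterations(map_size: int, kernel_size: int) -> int:
    """Lower bound on VI iterations needed for goal information to reach
    every cell of a map_size x map_size grid (assumption: odd kernel,
    stride 1, goal in a corner, no obstacles)."""
    if kernel_size < 3 or kernel_size % 2 == 0:
        raise ValueError("kernel_size must be an odd integer >= 3")
    reach_per_iteration = (kernel_size - 1) // 2
    return math.ceil((map_size - 1) / reach_per_iteration)


def iterations_to_cover(map_size: int, kernel_size: int) -> int:
    """Empirical check: spread a 'reached' mask from the corner cell with a
    kernel_size x kernel_size window and count the steps until the whole
    grid is reached."""
    radius = (kernel_size - 1) // 2
    reached = np.zeros((map_size, map_size), dtype=bool)
    reached[0, 0] = True  # goal placed in the top-left corner
    steps = 0
    while not reached.all():
        padded = np.pad(reached, radius, constant_values=False)
        spread = np.zeros_like(reached)
        # A cell becomes reached if any cell inside its window was reached.
        for dy in range(kernel_size):
            for dx in range(kernel_size):
                spread |= padded[dy:dy + map_size, dx:dx + map_size]
        reached = spread
        steps += 1
    return steps


if __name__ == "__main__":
    for k in (3, 5, 9):
        print(k, min_iterations(28, k), iterations_to_cover(28, k))
```

Under these assumptions, enlarging the kernel lowers the iteration count roughly in proportion to the kernel radius, which is consistent with the idea of trading kernel size for network depth. Obstacles lengthen the true shortest paths, so a real map may need more iterations than this bound suggests.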