In this paper, we address two challenges faced by Value Iteration Networks (VIN): handling larger input maps and mitigating the errors that accumulate as the number of iterations grows. We propose Value Iteration Networks with Gated Summarization Module (GS-VIN), which introduces two main improvements: (1) an Adaptive Iteration Strategy in the Value Iteration (VI) module that reduces the number of iterations, and (2) a Gated Summarization module that summarizes the iterative process. The adaptive iteration strategy uses larger convolution kernels with fewer iterations, which reduces network depth and improves training stability while maintaining the accuracy of the planning process. The gated summarization module temporally and spatially resamples the entire planning process within the VI module, so the network can draw on the whole planning process rather than relying solely on the final global planning outcome. Experiments on 2D grid world path-finding problems and the Atari Mr. Pac-man environment show that GS-VIN outperforms the baseline in single-step accuracy, planning success rate, and overall performance across different map sizes. We also analyze the relationship between input size, kernel size, and the number of iterations in VI-based models; this analysis applies to most VI-based models and offers insights for both researchers and industrial deployment.
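The trade-off between kernel size and iteration count can be illustrated with a rough receptive-field argument. The sketch below is not the analysis from the paper; it assumes an obstacle-free square grid, a square kernel of odd size, and that value information spreads by at most (kernel_size - 1) / 2 cells per iteration. The function names `propagation_radius`, `min_iterations`, and `iterations_by_simulation` are hypothetical and chosen only for this illustration.

```python
import math


def propagation_radius(kernel_size: int) -> int:
    """Cells that information can travel in one iteration (assumed: odd square kernel)."""
    if kernel_size < 3 or kernel_size % 2 == 0:
        raise ValueError("kernel_size must be an odd integer >= 3")
    return (kernel_size - 1) // 2


def min_iterations(map_size: int, kernel_size: int) -> int:
    """Iterations needed for information to cross an obstacle-free map_size x map_size grid.

    Assumption: the worst case is corner to opposite corner, which is
    map_size - 1 cells away when diagonal moves are allowed.
    """
    if map_size < 1:
        raise ValueError("map_size must be >= 1")
    return math.ceil((map_size - 1) / propagation_radius(kernel_size))


def iterations_by_simulation(map_size: int, kernel_size: int) -> int:
    """Check the bound by spreading a 'reached' flag from one corner, window by window."""
    r = propagation_radius(kernel_size)
    reached = [[False] * map_size for _ in range(map_size)]
    reached[0][0] = True
    steps = 0
    while not all(all(row) for row in reached):
        # A cell becomes reached if any cell inside its kernel window was reached.
        reached = [
            [
                any(
                    reached[a][b]
                    for a in range(max(0, i - r), min(map_size, i + r + 1))
                    for b in range(max(0, j - r), min(map_size, j + r + 1))
                )
                for j in range(map_size)
            ]
            for i in range(map_size)
        ]
        steps += 1
    return steps


if __name__ == "__main__":
    for map_size in (8, 16, 28):
        for kernel_size in (3, 5, 7):
            k = min_iterations(map_size, kernel_size)
            print(f"map {map_size}x{map_size}, kernel {kernel_size}: at least {k} iterations")
```

Under these assumptions, moving from a 3x3 kernel to a 5x5 kernel roughly halves the number of iterations needed to cover the same map. Obstacles lengthen the shortest paths, so real maps may need more iterations than this lower bound suggests.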