Social robot navigation is an open and challenging problem. Existing work uses separate modules to capture spatial and temporal features. However, this separation makes it harder to exploit spatio-temporal features fully and to reduce the conservatism of the navigation policy. In light of this, we present a spatio-temporal Transformer-based policy optimization algorithm that improves the utilization of spatio-temporal features and thereby facilitates the capture of human-robot interactions. Specifically, we introduce a gated embedding mechanism that aligns the spatial and temporal representations by integrating both modalities at the feature level. A Transformer is then used to encode the spatio-temporal semantic information, with the aim of finding the optimal navigation policy. Finally, combining the spatio-temporal Transformer with self-adjusting policy entropy significantly reduces the conservatism of the navigation policy. Experimental results demonstrate the effectiveness and superior performance of the proposed framework.
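To make the idea of feature-level gating concrete, the following is a minimal sketch of one common form of gated fusion, not the paper's actual mechanism, which the abstract does not specify. It assumes the spatial and temporal features are vectors of equal length, and that a learned gate in (0, 1) decides, per dimension, how much of each modality to keep. The names `gated_fuse`, `weights`, and `bias` are hypothetical and chosen only for illustration.

```python
import math


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def gated_fuse(spatial, temporal, weights, bias):
    """Fuse two equal-length feature vectors with an element-wise gate.

    Assumed form (illustrative, not taken from the paper):
        gate  = sigmoid(W @ [spatial; temporal] + b)
        fused = gate * spatial + (1 - gate) * temporal

    weights: one row per output dimension, each of length 2 * d
    bias:    one value per output dimension
    """
    if len(spatial) != len(temporal):
        raise ValueError("spatial and temporal features must have equal length")
    joint = list(spatial) + list(temporal)  # concatenate both modalities
    fused = []
    for i, (s, t) in enumerate(zip(spatial, temporal)):
        pre_activation = sum(w * x for w, x in zip(weights[i], joint)) + bias[i]
        gate = sigmoid(pre_activation)
        fused.append(gate * s + (1.0 - gate) * t)
    return fused


if __name__ == "__main__":
    spatial = [1.0, 3.0]
    temporal = [3.0, 5.0]
    zero_weights = [[0.0] * 4 for _ in range(2)]
    # With zero weights and bias the gate is 0.5, so the result is the average.
    print(gated_fuse(spatial, temporal, zero_weights, [0.0, 0.0]))
```

In this sketch the gate parameters would be learned jointly with the policy; the fused vector is what a downstream Transformer encoder would consume as a token embedding.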