Reinforcement learning can obtain generalizable low-level robot policies from diverse robotics datasets in embodied learning scenarios, and the Transformer has been widely used to model time-varying features. However, this approach still suffers from low data efficiency and high inference latency. In this paper, we investigate the task from a new perspective: the frequency domain. We first observe that the energy density of a robot's trajectory in the frequency domain is concentrated mainly in the low-frequency part. We then present the Fourier Controller Network (FCNet), a new network that uses the Short-Time Fourier Transform (STFT) to extract and encode time-varying features through frequency-domain interpolation. We further achieve parallel training and efficient recurrent inference for real-time decision-making by using the FFT and the Sliding DFT in the model architecture. Comprehensive analyses in both simulated (e.g., D4RL) and real-world (e.g., robot locomotion) environments demonstrate FCNet's substantial efficiency and effectiveness over existing methods such as the Transformer; for example, FCNet outperforms the Transformer on multi-environment robotics datasets of all sizes (from 1.9M to 120M). The project page and code can be found at https://thkkk.github.io/fcnet.
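To illustrate why a Sliding DFT permits efficient recurrent inference, the following is a minimal NumPy sketch, not FCNet's actual implementation. It assumes a one-dimensional signal, a rectangular window, and that only the lowest few frequency modes are retained; the function name `sliding_dft` and the parameters `window` and `num_modes` are hypothetical. After a single FFT on the first window, each new sample updates every retained coefficient in constant time instead of recomputing the transform.

```python
import numpy as np

def sliding_dft(signal, window, num_modes):
    """Return the lowest `num_modes` DFT coefficients of every length-`window`
    slice of `signal`, computed recurrently (illustrative sketch only)."""
    signal = np.asarray(signal, dtype=float)
    k = np.arange(num_modes)
    # Per-mode phase rotation applied when the window advances by one sample.
    twiddle = np.exp(2j * np.pi * k / window)
    # Initialise with one direct FFT of the first window.
    spectrum = np.fft.fft(signal[:window])[:num_modes]
    out = [spectrum.copy()]
    for t in range(window, len(signal)):
        # Drop the oldest sample, add the newest, then rotate each mode.
        spectrum = (spectrum - signal[t - window] + signal[t]) * twiddle
        out.append(spectrum.copy())
    return np.stack(out)  # shape: (len(signal) - window + 1, num_modes)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.standard_normal(64)
    coeffs = sliding_dft(x, window=16, num_modes=4)
    # Compare the last recurrent result with a direct FFT of the last window.
    direct = np.fft.fft(x[-16:])[:4]
    print(np.allclose(coeffs[-1], direct))
```

Under these assumptions, each step costs O(`num_modes`) operations regardless of the window length, which is the property that makes step-by-step inference cheap, while training can still apply the FFT to whole sequences in parallel.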