In this work, we explore the use of hierarchical reinforcement learning (HRL) for the task of temporal sequence prediction. Using a combination of deep learning and HRL, we develop a stock agent to predict temporal price sequences from historical stock price data and a vehicle agent to predict steering angles from first-person dash cam images. Our results in both domains indicate that a type of HRL called feudal reinforcement learning provides significant improvements in training speed, stability, and prediction accuracy over standard RL. A key component of this success is the multi-resolution structure, which introduces both temporal and spatial abstraction into the network hierarchy.