A Deep Reinforcement Learning Approach to Automated Stock Trading, using xLSTM Networks

Traditional Long Short-Term Memory (LSTM) networks are effective for sequential data but suffer from vanishing gradients and difficulty capturing long-term dependencies, which can hurt their performance in dynamic, risky environments such as stock trading. To address these limitations, this study explores the recently introduced Extended Long Short-Term Memory (xLSTM) network in combination with deep reinforcement learning (DRL) for automated stock trading. The proposed method uses xLSTM networks in both the actor and the critic, enabling effective handling of time series data and dynamic market environments. Proximal Policy Optimization (PPO), with its ability to balance exploration and exploitation, is employed to optimize the trading strategy. Experiments were conducted on financial data from major tech companies over a comprehensive timeline, demonstrating that the xLSTM-based model outperforms LSTM-based methods on key trading evaluation metrics, including cumulative return, average profitability per trade, maximum earning rate, maximum pullback, and Sharpe ratio. These findings highlight the potential of xLSTM for enhancing DRL-based stock trading systems.
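Three of the evaluation metrics named in the abstract can be illustrated with a short sketch. The abstract does not give the paper's formulas, so the definitions below are common textbook ones and should be read as assumptions: "maximum pullback" is taken to mean maximum drawdown, the Sharpe ratio is annualized with 252 trading periods per year and a zero risk-free rate by default, and the function names and the sample equity curve are hypothetical.

```python
import math
from statistics import mean, stdev


def cumulative_return(equity):
    """Total return over the period, as a fraction of starting equity."""
    return equity[-1] / equity[0] - 1.0


def max_drawdown(equity):
    """Largest peak-to-trough decline, as a positive fraction of the peak."""
    peak = equity[0]
    worst = 0.0
    for value in equity:
        peak = max(peak, value)
        worst = max(worst, (peak - value) / peak)
    return worst


def sharpe_ratio(equity, risk_free_per_period=0.0, periods_per_year=252):
    """Annualized Sharpe ratio from per-period simple returns.

    Assumes at least three equity points and returns that are not all equal.
    """
    returns = [b / a - 1.0 for a, b in zip(equity, equity[1:])]
    excess = [r - risk_free_per_period for r in returns]
    return mean(excess) / stdev(excess) * math.sqrt(periods_per_year)


# Made-up portfolio values, for illustration only.
equity_curve = [100.0, 110.0, 99.0, 120.0]
print(cumulative_return(equity_curve))
print(max_drawdown(equity_curve))
print(sharpe_ratio(equity_curve))
```

For this sample curve the portfolio ends 20% above its starting value, and the largest decline is the 10% drop from 110 to 99. A higher cumulative return and Sharpe ratio are better, while a smaller maximum drawdown is better.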

Long Short-Term Memory

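Long short-term memory (LSTM) is an artificial recurrent neural network (RNN) architecture used in deep learning. Unlike standard feedforward neural networks, LSTM has feedback connections. It can process not only single data points (such as images) but also entire sequences of data (such as speech or video). For example, LSTM is applicable to tasks such as unsegmented, connected handwriting recognition, speech recognition, and anomaly detection in network traffic or intrusion detection systems (IDS). A common LSTM unit is composed of a cell, an input gate, an output gate, and a forget gate. The cell remembers values over arbitrary time intervals, and the three gates regulate the flow of information into and out of the cell. LSTM networks are well suited to classifying, processing, and making predictions based on time series data, since there can be lags of unknown duration between important events in a time series. LSTMs were developed to deal with the vanishing gradient problem that can be encountered when training traditional RNNs. Relative insensitivity to gap length is an advantage of LSTM over RNNs, hidden Markov models, and other sequence learning methods in numerous applications.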
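The cell and its three gates can be made concrete with a minimal sketch of one LSTM time step. It is deliberately simplified to a single unit with scalar input and state, uses the standard LSTM update equations, and is not the xLSTM variant discussed in the abstract. The function names, the parameter layout, and the hand-picked weights are all hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, params):
    """One time step of a single-unit LSTM cell (scalar input and state)."""

    # Each gate has an input weight w, a recurrent weight u, and a bias b.
    def pre(gate):
        w, u, b = params[gate]
        return w * x + u * h_prev + b

    i = sigmoid(pre("input"))        # how much new information to write
    f = sigmoid(pre("forget"))       # how much of the old cell state to keep
    o = sigmoid(pre("output"))       # how much of the cell state to expose
    g = math.tanh(pre("candidate"))  # proposed new cell content

    c = f * c_prev + i * g           # updated cell state (the memory)
    h = o * math.tanh(c)             # updated hidden state (the output)
    return h, c


# Hypothetical hand-picked parameters, not trained values.
params = {
    "input": (0.5, 0.1, 0.0),
    "forget": (0.0, 0.0, 5.0),  # large bias keeps the forget gate near 1
    "output": (0.5, 0.1, 0.0),
    "candidate": (1.0, 0.0, 0.0),
}

h, c = 0.0, 0.0
for x in [1.0, 0.0, 0.0, 0.0]:  # one input event followed by silence
    h, c = lstm_step(x, h, c, params)
    print(round(h, 4), round(c, 4))
```

Because the forget gate stays close to 1, the cell state written at the first step decays only slightly over the following zero-input steps. That additive update of `c` is what lets the cell carry a value across a gap of arbitrary length.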
