Quantitative markets change rapidly and carry substantial uncertainty, so choosing actions that make a profit in stock trading remains a challenging problem. Reinforcement learning (RL), a reward-oriented approach to optimal control, has emerged as a promising method for this strategic decision-making problem in complex financial settings. In this paper, we integrate two existing financial trading strategies, constant proportion portfolio insurance (CPPI) and time-invariant portfolio protection (TIPP), into multi-agent deep deterministic policy gradient (MADDPG) and propose two purpose-built multi-agent RL (MARL) methods, CPPI-MADDPG and TIPP-MADDPG, for investigating strategic trading in quantitative markets. We then test the proposed approaches on 100 different shares from the real financial market. The experimental results show that CPPI-MADDPG and TIPP-MADDPG generally outperform conventional approaches.
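For readers unfamiliar with the two portfolio-insurance rules named above, the following is a minimal sketch of their textbook forms, not the paper's integration with MADDPG. CPPI invests a fixed multiple of the cushion (portfolio value minus a floor) in the risky asset, while TIPP additionally ratchets the floor up to a fixed fraction of the portfolio value and never lowers it. The function names, the parameters `floor_ratio` and `multiplier`, their default values, and the zero-return safe asset are all assumptions made for illustration.

```python
def cppi_exposure(value, floor, multiplier):
    """Risky-asset exposure: multiplier * cushion, clipped to [0, value] (no leverage)."""
    cushion = max(value - floor, 0.0)
    return min(multiplier * cushion, value)


def tipp_floor(value, prev_floor, floor_ratio):
    """TIPP floor: a fixed fraction of portfolio value that is never lowered."""
    return max(prev_floor, floor_ratio * value)


def simulate(returns, strategy="cppi", initial_value=100.0,
             floor_ratio=0.8, multiplier=3.0):
    """Apply CPPI or TIPP over a sequence of per-period risky-asset returns.

    Assumption: the safe asset earns zero return, so under CPPI the floor
    stays constant. Returns a list of (value, floor, risky_exposure) tuples.
    """
    value = initial_value
    floor = floor_ratio * initial_value
    history = []
    for r in returns:
        if strategy == "tipp":
            # Ratchet the floor before rebalancing for this period.
            floor = tipp_floor(value, floor, floor_ratio)
        risky = cppi_exposure(value, floor, multiplier)
        # Only the risky part of the portfolio is affected by the return.
        value = value + risky * r
        history.append((value, floor, risky))
    return history


if __name__ == "__main__":
    example_returns = [0.10, -0.05]  # hypothetical returns, not market data
    for name in ("cppi", "tipp"):
        print(name, simulate(example_returns, strategy=name))
```

In this sketch the only difference between the two rules is the floor update: after a gain, TIPP raises the floor and therefore holds less of the risky asset than CPPI in the following period.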