Financial markets, with more than $90 trillion in market capitalization, attract the attention of innumerable investors around the world. Recently, reinforcement learning in financial markets (FinRL) has emerged as a promising direction for training agents to make profitable investment decisions. However, the evaluation of most FinRL methods focuses only on profit-related measures and ignores many critical axes, which falls far short of what financial practitioners need before deploying these methods in real-world financial markets. Therefore, we introduce PRUDEX-Compass, which has 6 axes, i.e., Profitability, Risk-control, Universality, Diversity, rEliability, and eXplainability, with a total of 17 measures for systematic evaluation. Specifically, i) we propose AlphaMix+ as a strong FinRL baseline, which leverages mixture-of-experts (MoE) and risk-sensitive approaches to make diversified, risk-aware investment decisions; ii) we evaluate 8 FinRL methods on 4 long-term real-world datasets from influential financial markets to demonstrate the usage of PRUDEX-Compass; iii) PRUDEX-Compass, together with the 4 real-world datasets, standard implementations of the 8 FinRL methods, and a portfolio management environment, is released as a public resource to facilitate the design and comparison of new FinRL methods. We hope that PRUDEX-Compass can not only shed light on future FinRL research, preventing untrustworthy results from holding back the successful industry deployment of FinRL, but also provide a challenging new algorithm evaluation scenario for the reinforcement learning (RL) community.
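To illustrate why profit-related measures alone can be misleading, the following minimal sketch compares two hypothetical equity curves that end with the same total return but carry very different risk. It is not taken from the PRUDEX-Compass release: the function names, the zero risk-free rate, the 252-period annualization, and both price series are assumptions made for illustration, and the three quantities shown (total return, maximum drawdown, Sharpe ratio) are common textbook measures rather than a statement of which 17 measures the compass uses.

```python
import math
import statistics


def total_return(values):
    """Fractional gain from the first to the last portfolio value."""
    return values[-1] / values[0] - 1.0


def max_drawdown(values):
    """Largest peak-to-trough loss, as a fraction of the running peak."""
    peak = values[0]
    worst = 0.0
    for v in values:
        peak = max(peak, v)
        worst = max(worst, (peak - v) / peak)
    return worst


def sharpe_ratio(values, periods_per_year=252):
    """Annualized mean/std of simple returns; risk-free rate assumed to be zero."""
    returns = [b / a - 1.0 for a, b in zip(values, values[1:])]
    std = statistics.stdev(returns)
    if std == 0:
        return float("nan")  # undefined when returns never vary
    return statistics.mean(returns) / std * math.sqrt(periods_per_year)


# Two hypothetical equity curves with identical start and end values.
steady = [100.0, 102.0, 104.0, 106.0, 108.0, 110.0]
volatile = [100.0, 130.0, 80.0, 120.0, 90.0, 110.0]

for name, curve in [("steady", steady), ("volatile", volatile)]:
    print(
        name,
        "return=%.4f" % total_return(curve),
        "max_drawdown=%.4f" % max_drawdown(curve),
        "sharpe=%.2f" % sharpe_ratio(curve),
    )
```

Both curves report the same total return, so a profit-only evaluation cannot tell them apart, while the drawdown and Sharpe ratio separate them clearly. This is the kind of gap that a multi-axis evaluation is meant to expose.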