This paper studies ensemble Reinforcement Learning (RL) models for financial trading strategies, using classifier models to enhance performance. We combine RL algorithms such as A2C, PPO, and SAC with traditional classifiers such as Support Vector Machines (SVM), Decision Trees, and Logistic Regression, and investigate how different classifier groups can be integrated to improve risk-return trade-offs. We evaluate several ensemble methods against individual RL models on key financial metrics: Cumulative Return, Sharpe Ratio (SR), Calmar Ratio, and Maximum Drawdown (MDD). Our results show that the ensemble methods consistently outperform the base models in risk-adjusted returns, with better drawdown management and greater overall stability. However, ensemble performance is sensitive to the choice of the variance threshold τ, which highlights the importance of adjusting τ dynamically to achieve optimal performance. The study emphasizes the value of combining RL with classifiers for adaptive decision-making, with implications for financial trading, robotics, and other dynamic environments.
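For readers unfamiliar with the four evaluation metrics named above, the following is a minimal sketch of how they are commonly computed from a series of per-period simple returns. It is not taken from the paper: the function name `performance_metrics`, the 252-period annualization, the zero default risk-free rate, and the convention of reporting MDD as a positive fraction are all assumptions made for illustration, and the paper's exact definitions may differ.

```python
import numpy as np


def performance_metrics(returns, periods_per_year=252, risk_free_rate=0.0):
    """Illustrative metrics from per-period simple returns (hypothetical helper)."""
    r = np.asarray(returns, dtype=float)

    # Equity curve starting from 1.0, compounding each period's return.
    equity = np.concatenate(([1.0], np.cumprod(1.0 + r)))
    cumulative_return = equity[-1] - 1.0

    # Sharpe Ratio: annualized mean excess return over its sample std.
    excess = r - risk_free_rate / periods_per_year
    std = excess.std(ddof=1) if r.size > 1 else 0.0
    sharpe = np.sqrt(periods_per_year) * excess.mean() / std if std > 0 else float("nan")

    # Maximum Drawdown: largest peak-to-trough decline, as a positive fraction.
    running_peak = np.maximum.accumulate(equity)
    max_drawdown = float(-(equity / running_peak - 1.0).min())

    # Calmar Ratio: annualized compound return divided by MDD.
    annualized_return = equity[-1] ** (periods_per_year / r.size) - 1.0
    calmar = annualized_return / max_drawdown if max_drawdown > 0 else float("nan")

    return {
        "cumulative_return": float(cumulative_return),
        "sharpe_ratio": float(sharpe),
        "max_drawdown": max_drawdown,
        "calmar_ratio": float(calmar),
    }


if __name__ == "__main__":
    # Toy return series, made up purely to exercise the function.
    print(performance_metrics([0.10, -0.50, 0.20]))
```

Under these assumptions, a higher Sharpe or Calmar Ratio and a lower MDD indicate a better risk-return trade-off, which is the sense in which the ensembles are compared with the base RL models.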