The unpredictability and volatility of the stock market make it challenging to earn a substantial profit with any generalised scheme. Many previous studies have tried different techniques to build machine learning models that can make a significant profit in live trading on the US stock market. However, very few studies have focused on the importance of finding the best features for a particular trading period. Our best-performing approach used performance-based screening to narrow the feature set from 148 to about 30. The top 25 features were then selected dynamically before each training run of our machine learning model. The model uses ensemble learning with four classifiers (Gaussian Naive Bayes, Decision Tree, Logistic Regression with L1 regularization, and Stochastic Gradient Descent) to decide whether to go long or short on a particular stock. Our best model traded daily between July 2011 and January 2019, generating a 54.35% profit. Finally, our work shows that mixtures of weighted classifiers perform better than any individual predictor at making trading decisions in the stock market.
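The two mechanisms described in the abstract, dynamic top-k feature selection and a weighted mixture of classifier decisions, can be illustrated with a minimal sketch. The abstract does not state how features are scored, how classifier weights are set, or how ties are resolved, so everything here is an assumption made for illustration: the function names `select_top_features` and `weighted_vote`, the per-feature scores, the classifier weights, the +1/-1 encoding of long/short, and the choice to go long on a tie.

```python
from typing import Dict, List


def select_top_features(scores: Dict[str, float], k: int = 25) -> List[str]:
    """Return the k highest-scoring feature names.

    `scores` is a hypothetical per-feature performance score recomputed
    before each training run; ties are broken alphabetically by name.
    """
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [name for name, _ in ranked[:k]]


def weighted_vote(signals: Dict[str, int], weights: Dict[str, float]) -> int:
    """Combine per-classifier signals into one trading decision.

    Each signal is +1 (go long) or -1 (go short). The decision is the sign
    of the weighted sum; a zero sum is treated as long (an assumption).
    """
    total = sum(weights[name] * signal for name, signal in signals.items())
    return 1 if total >= 0 else -1


if __name__ == "__main__":
    # Illustrative values only; none of these come from the paper.
    feature_scores = {"momentum_5d": 0.9, "volume_z": 0.1, "rsi_14": 0.5}
    print(select_top_features(feature_scores, k=2))

    signals = {"gnb": 1, "tree": -1, "logreg_l1": -1, "sgd": 1}
    weights = {"gnb": 0.4, "tree": 0.1, "logreg_l1": 0.2, "sgd": 0.3}
    print("long" if weighted_vote(signals, weights) == 1 else "short")
```

In this sketch a classifier with a larger weight can outvote several weaker ones, which is the sense in which a weighted mixture differs from a simple majority vote.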