Feature Selection with Annealing for Forecasting Financial Time Series

Stock market and cryptocurrency forecasting is very important to investors, who aspire to achieve even the slightest improvement to their buy or hold strategies so that they may increase profitability. However, obtaining accurate and reliable predictions is challenging (accuracy does not equate to reliability), especially in financial time-series forecasting, owing to the complex and chaotic tendencies of such series. To mitigate this complexity, this study provides a comprehensive method for forecasting financial time series based on tactical input-output feature mapping techniques using machine learning (ML) models. During the prediction process, selecting the relevant indicators is vital to obtaining the desired results. In the financial field, limited attention has been paid to this problem with ML solutions. We investigate the use of feature selection with annealing (FSA) for the first time in this field, and we apply the least absolute shrinkage and selection operator (Lasso) method to select the features from more than 1,000 candidates obtained from 26 technical classifiers with different periods and lags. Boruta (BOR) feature selection, a wrapper method, is used as a baseline for comparison. Logistic regression (LR), extreme gradient boosting (XGBoost), and long short-term memory (LSTM) are then applied to the selected features for forecasting purposes using 10 different financial datasets containing cryptocurrencies and stocks. The dependent variables consisted of daily logarithmic returns and trends. The mean-squared error for regression, area under the receiver operating characteristic curve, and classification accuracy were used to evaluate model performance, and the statistical significance of the forecasting results was tested using paired t-tests. Experiments indicate that the FSA algorithm increased the performance of ML models, regardless of problem type.
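
The abstract names daily logarithmic returns and trends as the dependent variables but does not define them. The following is a minimal sketch, assuming the usual definition r_t = ln(P_t / P_{t-1}) for the log return and, as a purely illustrative assumption, a binary trend label that is 1 when the return is positive and 0 otherwise. The function names `log_returns` and `trend_labels` are hypothetical and do not come from the paper.

```python
import math


def log_returns(prices):
    """Daily log returns r_t = ln(P_t / P_{t-1}) from a list of closing prices."""
    return [math.log(p1 / p0) for p0, p1 in zip(prices, prices[1:])]


def trend_labels(returns):
    """Binary trend target (assumed definition): 1 for an up day, 0 otherwise."""
    return [1 if r > 0 else 0 for r in returns]


if __name__ == "__main__":
    closes = [100.0, 110.0, 99.0, 99.0]  # made-up prices, for illustration only
    r = log_returns(closes)
    print(r)
    print(trend_labels(r))
```

Under these assumptions, the returns would serve as the regression target (evaluated with mean-squared error) and the labels as the classification target (evaluated with AUC and accuracy). Log returns also add up across days, so the sum over a period equals the log of the ratio of the last price to the first.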


Feature selection

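Feature selection, also called feature subset selection (FSS) or attribute selection, means choosing N features from an existing set of M features so that a specified criterion of the system is optimized. It is the process of picking the most effective features from the original ones in order to reduce the dimensionality of the dataset. It is an important means of improving the performance of learning algorithms and a key data preprocessing step in pattern recognition. For a learning algorithm, good training samples are the key to training a model.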
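
Choosing N of M features can be illustrated with an annealing-style procedure in the spirit of the FSA method named in the abstract. The text above does not give the algorithm, so everything here is an assumption: a squared-error loss, plain gradient descent, and a removal schedule of the form k + (M - k) * max(0, (N_iter - 2e) / (2e*mu + N_iter)), which shrinks the active set to k features by the halfway point. The names `keep_count` and `fsa_select` and all default values are hypothetical.

```python
import random


def keep_count(e, n_features, k, n_iter, mu):
    """Assumed annealing schedule: how many features stay active after iteration e."""
    frac = max(0.0, (n_iter - 2 * e) / (2 * e * mu + n_iter))
    return k + int((n_features - k) * frac)


def fsa_select(X, y, k, n_iter=100, lr=0.05, mu=10.0):
    """Sketch of annealing-style selection for linear regression.

    Alternates one gradient step on the mean squared error with dropping
    the active features whose coefficients are smallest in magnitude.
    Returns the indices of the k kept features and the coefficient list.
    """
    n, m = len(X), len(X[0])
    active = list(range(m))
    beta = [0.0] * m
    for e in range(1, n_iter + 1):
        # Residuals of the current linear model restricted to active features.
        resid = [sum(beta[j] * row[j] for j in active) - t for row, t in zip(X, y)]
        # Gradient of the mean squared error with respect to each active coefficient.
        grads = {j: 2.0 / n * sum(r * row[j] for r, row in zip(resid, X)) for j in active}
        for j in active:
            beta[j] -= lr * grads[j]
        # Shrink the active set according to the schedule.
        keep = keep_count(e, m, k, n_iter, mu)
        if keep < len(active):
            ranked = sorted(active, key=lambda j: abs(beta[j]), reverse=True)
            for j in ranked[keep:]:
                beta[j] = 0.0
            active = sorted(ranked[:keep])
    return active, beta


if __name__ == "__main__":
    # Synthetic data (illustration only): 10 features, only 0 and 3 matter.
    rng = random.Random(0)
    X = [[rng.gauss(0, 1) for _ in range(10)] for _ in range(200)]
    y = [3 * row[0] - 2 * row[3] + 0.1 * rng.gauss(0, 1) for row in X]
    selected, coef = fsa_select(X, y, k=2)
    print(selected)
```

The sketch assumes features on comparable scales (for example, standardized columns), since ranking by coefficient magnitude is not meaningful otherwise. A classification variant would swap the squared-error gradient for a logistic-loss gradient.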
