Time series forecasting plays a crucial role in diverse fields and calls for robust models that can handle complex temporal patterns. In this article, we present a novel feature selection method embedded in Long Short-Term Memory (LSTM) networks and driven by a multi-objective evolutionary algorithm. Our approach optimizes the weights and biases of the LSTM in a partitioned manner: each objective function of the evolutionary algorithm targets the root mean square error (RMSE) on a specific data partition. The set of non-dominated forecast models identified by the algorithm is then used to construct a meta-model through stacking-based ensemble learning. The method also provides a measure of attribute importance, since the frequency with which each attribute is selected across the set of non-dominated forecast models reflects its significance, which adds an interpretable dimension to the forecasting process. Experimental evaluations on air quality time series data from Italy and southeast Spain demonstrate that our method substantially improves the generalization ability of conventional LSTMs and reduces overfitting. Comparative analyses against the state-of-the-art CancelOut and EAR-FS methods show that our approach performs better.
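The partitioned objectives, the non-dominated set, and the selection-frequency importance described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the evolutionary search, the LSTM itself, and the stacking meta-model are omitted, and the function names, the contiguous partitioning, the binary attribute masks, and the toy numbers are all assumptions made for illustration.

```python
import numpy as np

def partition_rmse(y_true, y_pred, n_partitions):
    """RMSE of one model on each data partition (one objective per partition).

    Assumption: partitions are contiguous, near-equal slices of the series.
    """
    true_parts = np.array_split(np.asarray(y_true, dtype=float), n_partitions)
    pred_parts = np.array_split(np.asarray(y_pred, dtype=float), n_partitions)
    return np.array([np.sqrt(np.mean((t - p) ** 2))
                     for t, p in zip(true_parts, pred_parts)])

def non_dominated(objectives):
    """Indices of models not dominated by any other model (all objectives minimized)."""
    objectives = np.asarray(objectives, dtype=float)
    front = []
    for i, row in enumerate(objectives):
        # Another model dominates `row` if it is no worse on every objective
        # and strictly better on at least one.
        no_worse = np.all(objectives <= row, axis=1)
        strictly_better = np.any(objectives < row, axis=1)
        if not np.any(no_worse & strictly_better):
            front.append(i)
    return front

def attribute_importance(masks, front):
    """Fraction of non-dominated models that select each attribute."""
    return np.asarray(masks, dtype=float)[front].mean(axis=0)

# Hypothetical example: 4 candidate models, 2 partitions, 3 attributes.
objectives = [[0.30, 0.50], [0.40, 0.40], [0.50, 0.30], [0.45, 0.55]]
masks = [[1, 0, 1], [1, 1, 0], [1, 0, 0], [0, 1, 1]]  # 1 = attribute selected

front = non_dominated(objectives)
importance = attribute_importance(masks, front)
print(front)
print(importance)
```

In this toy example the fourth model is worse than the first on both partitions, so it drops out of the non-dominated set. The first attribute is selected by every remaining model and therefore receives the highest importance score.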