Time series forecasting (TSF) is one of the most important tasks in data science because accurate time series (TS) predictive models play a major role in a wide variety of domains, including finance, transportation, health care, and power systems. Real-world use of machine learning (ML) typically involves (pre-)training models on collected historical data and then applying them to unseen data points. However, in real-world applications, time series data streams are usually non-stationary, and trained ML models tend to suffer from data or concept drift over time. To address this issue, models must be periodically retrained or redesigned, which takes significant human and computational resources. Additionally, historical data with which to retrain or redesign a model may not even exist. As a result, it is highly desirable that models be designed and trained in an online fashion. This work presents the Online NeuroEvolution-based Neural Architecture Search (ONE-NAS) algorithm, a novel neural architecture search method capable of automatically designing and dynamically training recurrent neural networks (RNNs) for online forecasting tasks. Without any pre-training, ONE-NAS utilizes populations of RNNs that are continuously updated with new network structures and weights in response to new multivariate input data. ONE-NAS is tested on real-world, large-scale multivariate wind turbine data as well as the univariate Dow Jones Industrial Average (DJIA) dataset. Results demonstrate that ONE-NAS outperforms traditional statistical time series forecasting methods, including online linear regression and state-of-the-art online ARIMA strategies, as well as fixed long short-term memory (LSTM) and gated recurrent unit (GRU) models trained online.
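To make the idea of a population that is updated online, without pre-training, more concrete, the following is a minimal toy sketch and not the ONE-NAS algorithm itself. It assumes several simplifications that are not in the abstract: candidates are linear autoregressive models rather than RNNs, the only "architecture" choice is the number of lags, training is plain stochastic gradient descent, and every name (`run_online`, `mutate`, `generation`, and so on) is hypothetical. Each time step, the current best candidate forecasts the next value before seeing it, all candidates are then scored and trained on the new observation, and periodically the worst candidate is replaced by a mutated copy of the best.

```python
import math
import random


def predict(weights, lags):
    """One-step-ahead forecast from the most recent len(weights) values."""
    window = lags[-len(weights):]
    return sum(w * x for w, x in zip(weights, window))


def sgd_step(weights, lags, target, lr):
    """Single gradient step on the squared one-step forecast error."""
    window = lags[-len(weights):]
    error = predict(weights, lags) - target
    return [w - lr * error * x for w, x in zip(weights, window)]


def mutate(weights, rng, max_lags):
    """Toy structural mutation (an assumption): add or drop the oldest lag."""
    child = list(weights)
    if rng.random() < 0.5 and len(child) < max_lags:
        child.insert(0, 0.0)  # new, older lag starts with zero weight
    elif len(child) > 1:
        child.pop(0)          # remove the oldest lag
    return child


def run_online(series, pop_size=6, max_lags=5, generation=20, lr=0.05, seed=0):
    """Predict-then-update loop over a stream; no pre-training is used."""
    rng = random.Random(seed)
    # Random initial "architectures": different lag counts, all-zero weights.
    population = [[0.0] * rng.randint(1, max_lags) for _ in range(pop_size)]
    recent_error = [0.0] * pop_size  # running average of squared error
    best = 0
    forecasts, squared_errors = [], []

    for t in range(max_lags, len(series)):
        lags, target = series[t - max_lags:t], series[t]

        # 1. Forecast with the current best candidate before seeing the target.
        forecast = predict(population[best], lags)
        forecasts.append(forecast)
        squared_errors.append((forecast - target) ** 2)

        # 2. Score every candidate on the new point, then train it on that point.
        for i, weights in enumerate(population):
            err = (predict(weights, lags) - target) ** 2
            recent_error[i] = 0.9 * recent_error[i] + 0.1 * err
            population[i] = sgd_step(weights, lags, target, lr)
        order = sorted(range(pop_size), key=lambda i: recent_error[i])
        best = order[0]

        # 3. Every `generation` steps, replace the worst candidate with a
        #    mutated copy of the best one (the child inherits its score).
        if (t - max_lags + 1) % generation == 0:
            worst = order[-1]
            population[worst] = mutate(population[best], rng, max_lags)
            recent_error[worst] = recent_error[best]

    return forecasts, squared_errors


if __name__ == "__main__":
    stream = [math.sin(0.1 * t) for t in range(600)]  # synthetic stand-in data
    _, errors = run_online(stream)
    print("mean squared error, last 100 steps:", sum(errors[-100:]) / 100)
```

The forecast is always issued before the corresponding observation is used for scoring or training, so the reported errors are genuine out-of-sample errors, which is the usual way to evaluate online forecasters. A fuller treatment in the spirit of the abstract would replace the linear candidates with recurrent networks and use richer structural mutations.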