Time Series Forecasting (TSF) is a widely researched topic with broad applications in weather forecasting, traffic control, and stock price prediction. Extreme values in time series often have a significant impact on human and natural systems, but they are hard to predict because they occur rarely. Statistical methods based on Extreme Value Theory (EVT) provide a systematic approach to modeling the distribution of extremes; in particular, the Generalized Pareto (GP) distribution models exceedances over a threshold. To address the poor performance of deep learning on heavy-tailed data, we propose a novel framework that sharpens the focus on extreme events: the Deep Extreme Mixture Model with Autoencoder (DEMMA) for time series prediction. The model comprises two main modules: 1) a generalized mixture distribution based on the Hurdle model and a reparameterized form of the GP distribution that is independent of the extreme threshold, and 2) an Autoencoder-based LSTM feature extractor and a quantile prediction module with a temporal attention mechanism. We demonstrate the effectiveness of our approach on multiple real-world rainfall datasets.
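As a rough illustration of the first module, the sketch below evaluates the CDF of a hurdle-style mixture with a GP tail: a point mass at zero (no rain), a bulk component on (0, u], and a GP distribution for exceedances over the threshold u. This is not the reparameterized, threshold-independent form the abstract refers to, which is not specified here. The truncated-exponential bulk, the function names, and all parameters (`p_zero`, `p_tail`, `u`, `bulk_rate`, `xi`, `sigma`) are assumptions made for illustration only; in a model such as DEMMA these quantities would presumably be produced by the network rather than fixed by hand.

```python
import math


def gp_cdf(z, xi, sigma):
    """CDF of the Generalized Pareto distribution for an exceedance z >= 0.

    xi is the shape parameter and sigma > 0 the scale parameter.
    """
    if z <= 0.0:
        return 0.0
    if abs(xi) < 1e-12:
        # Limit xi -> 0 is the exponential distribution.
        return 1.0 - math.exp(-z / sigma)
    base = 1.0 + xi * z / sigma
    if base <= 0.0:
        # Only reachable when xi < 0: z lies beyond the upper endpoint.
        return 1.0
    return 1.0 - base ** (-1.0 / xi)


def hurdle_gp_cdf(y, p_zero, p_tail, u, bulk_rate, xi, sigma):
    """CDF of an illustrative hurdle mixture with a GP tail.

    p_zero    : probability of an exact zero (the hurdle).
    p_tail    : probability that a non-zero value exceeds the threshold u.
    bulk_rate : rate of the exponential bulk, truncated to (0, u]
                (an assumed choice, not taken from the source).
    """
    if y < 0.0:
        return 0.0
    if y == 0.0:
        return p_zero
    nonzero = 1.0 - p_zero
    if y <= u:
        # Exponential CDF renormalized so that it reaches 1 at y = u.
        bulk = (1.0 - math.exp(-bulk_rate * y)) / (1.0 - math.exp(-bulk_rate * u))
        return p_zero + nonzero * (1.0 - p_tail) * bulk
    # Above the threshold: all bulk mass plus the GP share of the tail mass.
    return p_zero + nonzero * ((1.0 - p_tail) + p_tail * gp_cdf(y - u, xi, sigma))


if __name__ == "__main__":
    # Hypothetical parameter values, chosen only to exercise the functions.
    for y in (0.0, 5.0, 10.0, 30.0, 100.0):
        print(y, hurdle_gp_cdf(y, 0.6, 0.1, 10.0, 0.3, 0.2, 5.0))
```

Writing the mixture as a CDF keeps the three regimes explicit and makes quantile prediction a matter of inverting a monotone function, which is one reason a distributional output pairs naturally with a quantile prediction module.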