Forecasting rare events in multivariate time-series data is challenging because of severe class imbalance, long-range dependencies, and distributional uncertainty. We introduce EVEREST, a transformer-based architecture for probabilistic rare-event forecasting that delivers calibrated predictions and tail-aware risk estimation, with auxiliary interpretability via attention-based signal attribution. EVEREST integrates four components: (i) a learnable attention bottleneck for soft aggregation of temporal dynamics; (ii) an evidential head that estimates aleatoric and epistemic uncertainty via a Normal--Inverse--Gamma distribution; (iii) an extreme-value head that models tail risk using a Generalized Pareto Distribution; and (iv) a lightweight precursor head for early-event detection. These modules are jointly optimized with a composite loss combining focal loss, an evidential negative log-likelihood (NLL), and a tail-sensitive extreme-value-theory (EVT) penalty, and they act only at training time; deployment uses a single classification head with no inference overhead (approximately 0.81M parameters). On a decade of space-weather data, EVEREST achieves a state-of-the-art True Skill Statistic (TSS) of 0.973/0.970/0.966 at 24/48/72-hour horizons for C-class flares. The model is compact, efficient to train on commodity hardware, and applicable to high-stakes domains such as industrial monitoring, weather, and satellite diagnostics. Limitations include reliance on fixed-length inputs and exclusion of image-based modalities, motivating future extensions to streaming and multimodal forecasting.
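For readers unfamiliar with two of the quantities named in the abstract, the following is a minimal sketch of the True Skill Statistic and of a binary focal loss, using their standard textbook definitions. It is not taken from the EVEREST implementation: the function names `true_skill_statistic` and `focal_loss`, the defaults `gamma=2.0` and `alpha=0.25`, and the toy labels are all assumptions chosen for illustration.

```python
import math


def true_skill_statistic(y_true, y_pred):
    """TSS = TP / (TP + FN) - FP / (FP + TN) for binary labels in {0, 1}."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    if tp + fn == 0 or fp + tn == 0:
        raise ValueError("TSS needs at least one positive and one negative sample")
    # Recall on the event class minus the false-alarm rate on the non-event class.
    return tp / (tp + fn) - fp / (fp + tn)


def focal_loss(y_true, probs, gamma=2.0, alpha=0.25, eps=1e-12):
    """Mean binary focal loss: -alpha_t * (1 - p_t)**gamma * log(p_t).

    gamma and alpha are illustrative defaults (assumed), not values
    reported for EVEREST.
    """
    total = 0.0
    for t, p in zip(y_true, probs):
        p_t = p if t == 1 else 1.0 - p          # probability given to the true class
        alpha_t = alpha if t == 1 else 1.0 - alpha
        # (1 - p_t)**gamma down-weights examples that are already well classified.
        total += -alpha_t * (1.0 - p_t) ** gamma * math.log(max(p_t, eps))
    return total / len(y_true)


# Toy example (hypothetical data): 2 events among 8 windows, one false alarm.
labels = [1, 0, 0, 0, 1, 0, 0, 0]
preds = [1, 0, 0, 1, 1, 0, 0, 0]
tss = true_skill_statistic(labels, preds)
print(f"TSS = {tss:.4f}")
```

Because TSS subtracts the false-alarm rate from the recall, it does not depend on the ratio of events to non-events, which is why it is a common choice for heavily imbalanced forecasting tasks. The focal term plays a related role during training by shrinking the contribution of the many easy negative examples.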