Modelling and forecasting the occurrence of extreme events is especially difficult when the event process is nonstationary, with changes in both the rate at which extremes occur and the magnitude of the extremes when they occur. We approach this task by developing a Bayesian point process model for extreme events, which uses a self-exciting Hawkes process to model the rate at which extremes occur. The Hawkes process has a structure which allows events to occur in clusters, making it realistic for many types of data. We use a flexible Bayesian nonparametric approach based on the Dirichlet process to learn the temporal excitation pattern from the data. Further, we build on Extreme Value Theory by using a Generalised Pareto Distribution (GPD) to model the magnitudes of the extremes, with a hierarchical mark model allowing these magnitudes to vary across Hawkes-induced clusters. A hierarchical specification of the model results in partial pooling, allowing for more accurate GPD estimation even in clusters with only a small number of observations. We develop an MCMC algorithm to sample from the resulting hierarchical model. A simulation study confirms that the two flexible components improve prediction when the corresponding features are present in the data-generating mechanism, and across four real data sets the nonparametric Hawkes model with hierarchical GPD marks gives the best held-out predictive performance among the model variants considered.
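To make the two building blocks named above concrete, the following is a minimal simulation sketch, not the model proposed here. It assumes a parametric exponential excitation kernel in place of the Dirichlet-process kernel, and a single shared pair of GPD parameters in place of the hierarchical, cluster-specific marks. All function names and parameter names (`mu`, `alpha`, `beta`, `sigma`, `xi`) are hypothetical and chosen only for illustration.

```python
import math
import random


def hawkes_intensity(t, events, mu, alpha, beta):
    """Right-continuous intensity: mu + sum over t_i <= t of alpha*beta*exp(-beta*(t - t_i)).

    Assumption: exponential kernel with branching ratio alpha (alpha < 1 for stability).
    """
    return mu + sum(alpha * beta * math.exp(-beta * (t - s)) for s in events if s <= t)


def sample_gpd(rng, sigma, xi):
    """Draw one GPD exceedance by inverting the CDF; xi near 0 is the exponential limit."""
    u = rng.random()  # u in [0, 1)
    if abs(xi) < 1e-12:
        return -sigma * math.log1p(-u)
    return sigma / xi * ((1.0 - u) ** (-xi) - 1.0)


def simulate_marked_hawkes(horizon, mu, alpha, beta, sigma, xi, seed=0):
    """Simulate event times on (0, horizon) by Ogata thinning, with i.i.d. GPD marks.

    Assumption: marks share one (sigma, xi) pair, a simplification of
    cluster-varying hierarchical marks.
    """
    rng = random.Random(seed)
    t, times, marks = 0.0, [], []
    while True:
        # The kernel only decays between events, so the intensity at the
        # current time bounds the intensity until the next event.
        lam_bar = hawkes_intensity(t, times, mu, alpha, beta)
        t += rng.expovariate(lam_bar)
        if t >= horizon:
            break
        # Accept the candidate with probability intensity(t) / lam_bar.
        if rng.random() * lam_bar <= hawkes_intensity(t, times, mu, alpha, beta):
            times.append(t)
            marks.append(sample_gpd(rng, sigma, xi))
    return times, marks


if __name__ == "__main__":
    times, marks = simulate_marked_hawkes(
        horizon=50.0, mu=0.5, alpha=0.6, beta=2.0, sigma=1.0, xi=0.2, seed=1
    )
    print(len(times), "events; largest exceedance:", max(marks))
```

Each accepted event raises the intensity by `alpha * beta`, which is what produces the clustering of extremes; the GPD draw supplies the size of each exceedance over the threshold. Under the stated assumptions, the expected number of events on the horizon is roughly `mu * horizon / (1 - alpha)`.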