Although Non-Homogeneous Poisson Processes have given good results when used to model and study threshold exceedances in various time series of measurements, they need to be complemented by an efficient, automatic diagnostic technique for locating the change-points; when these are not taken into account, the estimated model fits the information contained in the real model poorly. For this reason, we propose a new method to resolve the segmentation uncertainty of a time series of measurements, in which the emission distribution of exceedances of a specific threshold is the focus of investigation. One of the main contributions of the algorithm is that every day on which the threshold was exceeded is a candidate change-point, so every possible configuration of exceedance days is a possible chromosome, and chromosomes are combined to produce offspring. Under the heuristics of a genetic algorithm, the search for such change-points is intended to be global rather than local and to reach the best possible solution, while reducing the machine time wasted on evaluating the chromosomes least likely to solve the problem. Candidate solutions are evaluated analytically using the Minimum Description Length (\textit{MDL}) as the objective function, which is the joint posterior distribution function of the parameters of each regime and of the change-points that determine them, and which also accounts for the influence of the presence of those change-points.
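As a rough illustration of the search described in the abstract, the Python sketch below encodes each chromosome as a sorted tuple of change-points drawn from the exceedance days and scores it with an MDL-style cost. It is not the authors' implementation. It assumes a piecewise-constant exceedance rate within each regime (the abstract does not specify the intensity function) and a simplified penalty in place of the joint posterior. The names `mdl_score`, `crossover`, `mutate`, and `genetic_changepoints`, as well as the population size, mutation rate, and synthetic data, are hypothetical choices made for the example.

```python
import math
import random
from bisect import bisect_left


def mdl_score(cps, days, horizon):
    """MDL-style cost (lower is better) of a sorted tuple of change-points.

    Assumption: each regime has a constant Poisson rate, estimated by
    maximum likelihood; `days` is sorted and every day lies in (0, horizon).
    """
    bounds = [0, *cps, horizon]
    # Simplified cost of encoding the number and location of change-points.
    cost = math.log(len(cps) + 1) + sum(math.log(t) for t in cps)
    for lo, hi in zip(bounds, bounds[1:]):
        n = bisect_left(days, hi) - bisect_left(days, lo)  # exceedances in [lo, hi)
        length = hi - lo
        if n > 0:
            cost -= n * math.log(n / length) - n  # minus maximised log-likelihood
        cost += 0.5 * math.log(length)  # cost of encoding one rate parameter
    return cost


def crossover(a, b, rng):
    """Uniform crossover: each parental change-point is inherited with prob. 1/2."""
    return tuple(t for t in sorted(set(a) | set(b)) if rng.random() < 0.5)


def mutate(child, candidates, rng, rate=0.3):
    """With probability `rate`, toggle one random candidate day in or out."""
    if rng.random() < rate:
        child = tuple(sorted(set(child) ^ {rng.choice(candidates)}))
    return child


def genetic_changepoints(days, horizon, pop_size=40, generations=60, seed=0):
    """Search for a low-cost change-point configuration among exceedance days."""
    rng = random.Random(seed)
    days = sorted(days)
    candidates = sorted({d for d in days if 0 < d < horizon})
    if not candidates:
        return ()

    def score(c):
        return mdl_score(c, days, horizon)

    # Start from the no-change-point model plus small random configurations.
    max_k = min(3, len(candidates))
    population = [()] + [
        tuple(sorted(rng.sample(candidates, rng.randint(1, max_k))))
        for _ in range(pop_size - 1)
    ]
    for _ in range(generations):
        population.sort(key=score)
        parents = population[: pop_size // 2]  # elitist truncation selection
        children = [
            mutate(crossover(*rng.sample(parents, 2), rng), candidates, rng)
            for _ in range(pop_size - len(parents))
        ]
        population = parents + children
    return min(population, key=score)


# Synthetic example: sparse exceedances before day 200, dense afterwards.
days = list(range(10, 200, 40)) + list(range(200, 400, 4))
best = genetic_changepoints(days, horizon=400)
print(best, mdl_score(best, sorted(days), 400))
```

Because the fitter half of each generation is carried over unchanged, the best cost found never increases from one generation to the next. This sketch gives no guarantee of reaching the global optimum.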