Although Non-Homogeneous Poisson Processes have given good results when used to model and study threshold exceedances in different time series of measurements, they need to be complemented with an efficient, automatic diagnostic technique for locating the change-points; when these are not taken into account, the estimated model fits the information contained in the real model poorly. For this reason, we propose a new method to resolve the segmentation uncertainty of a time series of measurements, in which the emission distribution of exceedances of a specific threshold is the focus of investigation. One of the main contributions of the algorithm is that every day on which the threshold was exceeded is a candidate change-point, so every possible configuration of exceedance days is a possible chromosome, and chromosomes are combined to produce offspring. Under the heuristics of a genetic algorithm, the search for such change-points is global rather than local and aims at the best possible solution, while reducing the machine time wasted on evaluating the chromosomes least likely to solve the problem. Candidate solutions are evaluated analytically using the Minimum Description Length (\textit{MDL}) as the objective function, which is the joint posterior distribution function of the parameters of each regime and of the change-points that determine them, and which also accounts for the influence of the presence of those change-points.
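The chromosome encoding described in the abstract can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the paper: each regime is given a constant Poisson rate instead of a non-homogeneous intensity, the penalty in `mdl_score` is a generic MDL-style code length rather than the authors' objective, and all function names and parameter values (`genetic_search`, `pop_size`, the mutation probability, and so on) are hypothetical.

```python
import math
import random


def mdl_score(exceed_days, change_points, horizon):
    """Toy MDL-style cost (lower is better). Assumption: constant rate per regime."""
    bounds = [0] + sorted(change_points) + [horizon]
    cost = 0.0
    for start, end in zip(bounds[:-1], bounds[1:]):
        n = sum(1 for d in exceed_days if start < d <= end)
        if n > 0:
            rate = n / (end - start)
            # Negative maximised Poisson log-likelihood of this regime.
            cost -= n * math.log(rate) - rate * (end - start)
            # Illustrative cost of encoding one rate parameter.
            cost += 0.5 * math.log(n)
    m = len(change_points)
    # Illustrative cost of encoding the number and locations of change-points.
    return cost + math.log(m + 1) + m * math.log(horizon)


def crossover(a, b, exceed_days, rng):
    # Uniform crossover: each candidate day inherits its state from a random parent.
    return frozenset(d for d in exceed_days if d in (a if rng.random() < 0.5 else b))


def mutate(c, exceed_days, rng, p=0.02):
    # Flip each candidate day in or out of the chromosome with probability p.
    return frozenset(d for d in exceed_days if (d in c) != (rng.random() < p))


def genetic_search(exceed_days, horizon, pop_size=40, generations=60, seed=0):
    rng = random.Random(seed)

    def score(c):
        return mdl_score(exceed_days, c, horizon)

    # A chromosome is a subset of the exceedance days, read as change-points.
    pop = [frozenset(d for d in exceed_days if rng.random() < 0.1)
           for _ in range(pop_size)]
    pop[0] = frozenset()  # include the "no change-point" configuration
    for _ in range(generations):
        pop.sort(key=score)
        parents = pop[: pop_size // 2]  # elitism: the fitter half survives
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents),
                             exceed_days, rng), exceed_days, rng)
            for _ in range(pop_size - len(parents))
        ]
        pop = parents + children
    return min(pop, key=score)


if __name__ == "__main__":
    # Synthetic data: sparse exceedances up to day 100, dense ones afterwards.
    days = list(range(10, 101, 10)) + list(range(102, 201, 2))
    print(sorted(genetic_search(days, horizon=200)))
```

Because the fitter half of each generation is carried over unchanged, the best score never gets worse from one generation to the next. This is a property of the sketch and does not by itself guarantee a global optimum.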