Order-preserving pattern (OPP) mining was recently proposed to discover patterns that represent trend changes in time series. Existing OPP mining algorithms perform well, but they discover all frequent patterns, whereas in some cases users care only about a particular trend and the trends associated with it. To efficiently discover trend information related to a specific prefix pattern, this paper addresses co-occurrence OPP (COP) mining and proposes an algorithm named COP-Miner to discover COPs from historical time series. COP-Miner consists of three parts: keypoint extraction, a preparation stage, and iterative support calculation and mining of frequent COPs. Keypoint extraction obtains the local extreme points of the patterns and the time series. The preparation stage sets up the first round of mining in four steps: obtaining the suffix OPP of the keypoint sub-time series, calculating the occurrences of the suffix OPP, verifying the occurrences of the keypoint sub-time series, and calculating the occurrences of all fusion patterns of the keypoint sub-time series. To further improve the efficiency of support calculation, we propose a support calculation method with an ending strategy that uses the occurrences of prefix and suffix patterns to calculate the occurrences of superpatterns. Experimental results indicate that COP-Miner outperforms competing algorithms in running time and scalability. Moreover, COPs with keypoint alignment yield better prediction performance.
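To make the terms in the abstract concrete, the following is a minimal brute-force sketch, not COP-Miner itself. It assumes that keypoints are strict local extrema plus the two endpoints, that an OPP is the tuple of ranks of the values in a window (values assumed distinct), and that an occurrence is recorded by its end index. The function names `extract_keypoints`, `relative_order`, `occurrences`, and `cooccurrence_supports` are hypothetical and chosen only for illustration; the abstract does not give the actual procedure, and the fusion-based support calculation with the ending strategy is not reproduced here.

```python
from collections import Counter
from typing import Dict, List, Sequence, Tuple


def extract_keypoints(series: Sequence[float]) -> List[int]:
    """Return indices of strict local extrema, plus the first and last index.

    Assumption: plateaus (equal neighbours) are not treated as extrema.
    """
    n = len(series)
    if n <= 2:
        return list(range(n))
    idx = [0]
    for i in range(1, n - 1):
        left = series[i] - series[i - 1]
        right = series[i + 1] - series[i]
        if left * right < 0:  # direction changes at position i
            idx.append(i)
    idx.append(n - 1)
    return idx


def relative_order(values: Sequence[float]) -> Tuple[int, ...]:
    """Rank of each value inside the window (1 = smallest); values assumed distinct."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        ranks[i] = rank
    return tuple(ranks)


def occurrences(series: Sequence[float], pattern: Sequence[int]) -> List[int]:
    """End indices of all windows whose relative order equals `pattern`."""
    m = len(pattern)
    target = tuple(pattern)
    return [
        i + m - 1
        for i in range(len(series) - m + 1)
        if relative_order(series[i:i + m]) == target
    ]


def cooccurrence_supports(
    series: Sequence[float], prefix: Sequence[int]
) -> Dict[Tuple[int, ...], int]:
    """Count superpatterns one element longer than `prefix` that start with it.

    Brute force: every window is re-ranked from scratch.
    """
    m = len(prefix)
    target = tuple(prefix)
    counts: Counter = Counter()
    for i in range(len(series) - m):
        window = series[i:i + m + 1]
        if relative_order(window[:m]) == target:
            counts[relative_order(window)] += 1
    return dict(counts)


series = [1, 2, 5, 4, 3, 7, 8, 6]
key_idx = extract_keypoints(series)       # indices of the keypoints
key_vals = [series[i] for i in key_idx]   # keypoint sub-time series
print(key_idx, key_vals)
print(occurrences(key_vals, (1, 3, 2)))
print(cooccurrence_supports(key_vals, (1, 2)))
```

In this sketch the keypoint sub-time series is much shorter than the raw series, which is the intuition behind mining on keypoints. Restricting the search to windows that begin with the given prefix is what separates co-occurrence mining from mining all frequent OPPs.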