Order-preserving pattern (OPP) mining is a type of sequential pattern mining in which an OPP is represented by a group of ranks of time series values. This approach can discover frequent trends in time series. Existing OPP mining algorithms treat data points at different times as equally important; however, newer data usually have a more significant impact, while older data have a weaker one. We therefore introduce a forgetting mechanism into OPP mining to reduce the importance of older data. This paper explores the mining of OPPs with a forgetting mechanism (OPFs) and proposes an algorithm called OPF-Miner that can discover frequent OPFs. OPF-Miner performs two tasks: candidate pattern generation and support calculation. For candidate pattern generation, OPF-Miner employs a maximal support priority strategy and a group pattern fusion strategy to avoid redundant pattern fusions. For support calculation, we propose an algorithm called support calculation with forgetting mechanism, which uses prefix and suffix pattern pruning strategies to avoid redundant support calculations. Experiments are conducted on nine datasets, comparing OPF-Miner against 12 alternative algorithms. The results verify that OPF-Miner outperforms the competing algorithms. More importantly, because it employs the forgetting mechanism, OPF-Miner yields good clustering performance for time series.
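To make the two core ideas concrete, the following is a minimal Python sketch of representing a subsequence by its ranks and of a support count in which older occurrences contribute less. It is an illustration under stated assumptions, not the OPF-Miner algorithm: the function names `relative_order` and `forgetting_support`, the `decay` parameter, and the exponential weight `exp(-decay * age)` are all hypothetical choices made here, since the abstract does not specify the forgetting function. The sketch also scans every window by brute force and uses none of the fusion or pruning strategies described above.

```python
import math
from typing import Sequence, Tuple


def relative_order(window: Sequence[float]) -> Tuple[int, ...]:
    """Return the rank of each value in the window (1 = smallest).

    Assumes distinct values; ties are broken by position.
    """
    order = sorted(range(len(window)), key=lambda i: window[i])
    ranks = [0] * len(window)
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank
    return tuple(ranks)


def forgetting_support(
    series: Sequence[float], pattern: Sequence[int], decay: float
) -> float:
    """Sum a weight for every window whose ranks match the pattern.

    Assumption (not from the paper): an occurrence ending `age` steps
    before the last data point is weighted by exp(-decay * age), so
    decay = 0 reduces to the ordinary occurrence count.
    """
    n, m = len(series), len(pattern)
    target = tuple(pattern)
    total = 0.0
    for start in range(n - m + 1):
        if relative_order(series[start:start + m]) == target:
            age = (n - 1) - (start + m - 1)  # distance from the newest point
            total += math.exp(-decay * age)
    return total


series = [3, 1, 2, 5, 4, 6, 8, 7]
rising, falling = (1, 2), (2, 1)

# Without forgetting, each occurrence counts as 1.
plain_rising = forgetting_support(series, rising, decay=0.0)
plain_falling = forgetting_support(series, falling, decay=0.0)

# With forgetting, recent occurrences dominate the support.
weighted_rising = forgetting_support(series, rising, decay=0.5)
weighted_falling = forgetting_support(series, falling, decay=0.5)
```

In this toy series the rising pattern occurs more often than the falling one, but the falling pattern's most recent occurrence ends at the newest data point, so with `decay=0.5` its weighted support is the larger of the two. This is the kind of reordering that a forgetting mechanism is meant to produce.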