Order-preserving pattern (OPP) mining, a recently proposed sequential pattern mining method, mines frequent relative orders in a time series. Although frequent relative orders can be used as features to classify time series, the mined patterns do not reflect the differences between two classes of time series well. To discover these differences effectively, this paper addresses top-k contrast OPP (COPP) mining and proposes the COPP-Miner algorithm, which discovers the top-k contrast patterns as features for time series classification and thereby avoids the problem of improper parameter setting. COPP-Miner is composed of three parts: extreme point extraction, which reduces the length of the original time series, and forward mining and reverse mining, which discover COPPs. Forward mining contains three steps: a group pattern fusion strategy to generate candidate patterns, a support rate calculation method to efficiently calculate the support of a pattern, and two pruning strategies to further prune candidate patterns. Reverse mining applies the same process as forward mining but uses one pruning strategy to prune candidate patterns. Experimental results validate the efficiency of the proposed algorithm and show that top-k COPPs can be used as features to obtain better classification performance.
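To make the terms in the abstract concrete, the following is a minimal Python sketch of the underlying ideas, not the COPP-Miner algorithm itself. Every function name is hypothetical, and several definitions are assumptions the abstract does not spell out: the relative order of a window is taken as the rank of each value within it, extreme points are taken as strict local minima and maxima plus the two endpoints, the support rate is taken as the fraction of series in a class that contain the pattern, and the contrast is taken as the difference between the support rates of the two classes. Candidate generation is a brute-force enumeration of all rank permutations, standing in for the group pattern fusion and pruning strategies, which are not reproduced here.

```python
import heapq
from itertools import permutations
from typing import List, Sequence, Tuple


def extreme_points(series: Sequence[float]) -> List[float]:
    """Keep both endpoints and every strict local minimum or maximum (assumed definition)."""
    if len(series) < 3:
        return list(series)
    kept = [series[0]]
    for prev, cur, nxt in zip(series, series[1:], series[2:]):
        if (cur > prev and cur > nxt) or (cur < prev and cur < nxt):
            kept.append(cur)
    kept.append(series[-1])
    return kept


def relative_order(window: Sequence[float]) -> Tuple[int, ...]:
    """Rank of each value within the window (1 = smallest); ties are broken by position."""
    by_value = sorted(range(len(window)), key=lambda i: window[i])
    ranks = [0] * len(window)
    for rank, idx in enumerate(by_value, start=1):
        ranks[idx] = rank
    return tuple(ranks)


def occurrences(series: Sequence[float], pattern: Sequence[int]) -> int:
    """Count the windows of the series whose relative order equals the pattern."""
    m = len(pattern)
    target = tuple(pattern)
    return sum(
        1
        for i in range(len(series) - m + 1)
        if relative_order(series[i:i + m]) == target
    )


def support_rate(dataset: Sequence[Sequence[float]], pattern: Sequence[int]) -> float:
    """Fraction of series containing the pattern at least once (assumed definition)."""
    if not dataset:
        return 0.0
    hits = sum(1 for s in dataset if occurrences(s, pattern) > 0)
    return hits / len(dataset)


def contrast(pos, neg, pattern) -> float:
    """Support rate in one class minus support rate in the other (assumed definition)."""
    return support_rate(pos, pattern) - support_rate(neg, pattern)


def top_k_contrast(pos, neg, length: int, k: int):
    """Brute force: score every rank permutation of the given length, keep the k best."""
    candidates = permutations(range(1, length + 1))
    scored = ((contrast(pos, neg, p), p) for p in candidates)
    return heapq.nlargest(k, scored)
```

For example, `relative_order([10, 30, 20])` gives `(1, 3, 2)`, and that pattern occurs twice in `[1, 3, 2, 5, 4]`. Because the caller asks for the k highest-scoring patterns, no minimum contrast threshold has to be chosen in this sketch, which is one way to read the abstract's remark about avoiding improper parameter setting.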