Understanding data change is critical to understanding trends, distinguishing normal from abnormal behaviours, recognizing patterns, and identifying the causes of change. Existing database systems have limited support for change management, relying on statistics, triggers, and constraints. Existing data quality rules model sequential changes along only a restricted set of attributes, quantify change among unordered tuples, and have limited ability to model the context under which attribute changes occur. In this paper, we introduce Change Rules (CRs) that quantify the sequential changes among ordered tuples in both the antecedent and consequent attributes. CRs address the limitations of existing declarative dependencies, supporting trend analysis and the exploration of causal relationships that trigger change among attributes. We propose CR-Miner, an automated algorithm for CR discovery that generates candidate change intervals in a level-wise manner. Experimental results show that CR-Miner achieves an average runtime improvement of 40-50% over existing baselines.
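To make the idea of quantifying sequential change among ordered tuples concrete, the following is a minimal illustrative sketch and not the paper's formal definition or the CR-Miner algorithm. It assumes a rule of the form "when the antecedent attribute changes by an amount within one interval between consecutive tuples, the consequent attribute changes by an amount within another interval". The class `ChangeRule`, the function `evaluate`, the attribute names, the sample data, and the support and confidence measures are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class ChangeRule:
    """Hypothetical CR: a change in `antecedent` within `ante_interval`
    implies a change in `consequent` within `cons_interval`."""
    antecedent: str
    ante_interval: Tuple[float, float]
    consequent: str
    cons_interval: Tuple[float, float]


def in_interval(value: float, interval: Tuple[float, float]) -> bool:
    """Closed-interval membership test."""
    low, high = interval
    return low <= value <= high


def evaluate(rule: ChangeRule, rows: List[Dict[str, float]],
             order_by: str) -> Tuple[int, float]:
    """Return (support, confidence) of `rule` over consecutive tuple pairs.

    Support (an assumed measure) counts pairs whose antecedent change falls
    in the antecedent interval; confidence is the fraction of those pairs
    whose consequent change also falls in the consequent interval.
    """
    ordered = sorted(rows, key=lambda r: r[order_by])
    matched = satisfied = 0
    for prev, curr in zip(ordered, ordered[1:]):
        d_ante = curr[rule.antecedent] - prev[rule.antecedent]
        d_cons = curr[rule.consequent] - prev[rule.consequent]
        if in_interval(d_ante, rule.ante_interval):
            matched += 1
            if in_interval(d_cons, rule.cons_interval):
                satisfied += 1
    confidence = satisfied / matched if matched else 0.0
    return matched, confidence


# Made-up sample data: tuples ordered by time step `t`.
rows = [
    {"t": 1, "dose": 10, "hr": 70},
    {"t": 2, "dose": 15, "hr": 74},
    {"t": 3, "dose": 20, "hr": 79},
    {"t": 4, "dose": 20, "hr": 78},
    {"t": 5, "dose": 25, "hr": 90},
]

# "A dose increase of 4 to 6 units is accompanied by an hr increase of 3 to 6."
rule = ChangeRule("dose", (4, 6), "hr", (3, 6))
support, confidence = evaluate(rule, rows, order_by="t")
print(support, round(confidence, 3))
```

On this sample, three consecutive pairs show a dose change inside the antecedent interval, and two of them also show a heart-rate change inside the consequent interval. A discovery algorithm would additionally have to search the space of candidate intervals, which this sketch does not attempt.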