Multi-criteria decision analysis in databases has been actively studied, especially through the Skyline operator. Yet few approaches offer a meaningful way to compare Pareto-optimal (Skyline) points when the result set has high cardinality. We propose to improve dp-idp, a recent approach inspired by tf-idf that computes a score for each Skyline point, by introducing the concept of dominance hierarchy. As dp-idp does not guarantee a distinct rank for every point, we introduce the TOPSIS-based CoSky method, derived from both information retrieval and multi-criteria analysis. CoSky, directly embeddable in a DBMS, automatically weights normalized attributes using the Gini index, then computes a score using Salton's cosine toward an ideal point. By coupling the multilevel Skyline with CoSky, we introduce DeepSky. Implementations of CoSky and dp-idp are evaluated experimentally.
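The CoSky pipeline described above (normalize, weight by Gini index, score by cosine toward an ideal point) can be illustrated with a minimal Python sketch. The abstract does not give the formulas, so several details here are assumptions: each attribute is sum-normalized over the Skyline, the Gini index of an attribute is taken as one minus the sum of its squared normalized values, and the ideal point takes the best weighted value per attribute. The function names `skyline` and `cosky_scores` and the `prefs` parameter are hypothetical, and attribute values are assumed strictly positive.

```python
import math


def skyline(points, prefs):
    """Return the points not dominated by any other point.

    prefs[j] is "min" or "max" (hypothetical parameter): the preferred
    direction for attribute j.
    """
    def dominates(p, q):
        # p dominates q if it is at least as good everywhere
        # and strictly better on at least one attribute.
        at_least_as_good = all(
            (a <= b) if d == "min" else (a >= b)
            for a, b, d in zip(p, q, prefs)
        )
        strictly_better = any(
            (a < b) if d == "min" else (a > b)
            for a, b, d in zip(p, q, prefs)
        )
        return at_least_as_good and strictly_better

    return [p for p in points if not any(dominates(q, p) for q in points)]


def cosky_scores(sky, prefs):
    """Score each Skyline point by cosine similarity to an ideal point.

    Assumed steps (not spelled out in the abstract):
    sum-normalization, Gini-based weights, per-attribute best value as ideal.
    """
    n_attr = len(prefs)

    # 1. Sum-normalize each attribute over the Skyline (values assumed > 0).
    totals = [sum(p[j] for p in sky) for j in range(n_attr)]
    norm = [[p[j] / totals[j] for j in range(n_attr)] for p in sky]

    # 2. Gini index per attribute, rescaled so the weights sum to 1.
    gini = [1.0 - sum(row[j] ** 2 for row in norm) for j in range(n_attr)]
    gini_sum = sum(gini)
    if gini_sum == 0:
        # Degenerate case (e.g. a single point): fall back to uniform weights.
        weights = [1.0 / n_attr] * n_attr
    else:
        weights = [g / gini_sum for g in gini]

    # 3. Apply the weights to the normalized values.
    weighted = [[weights[j] * row[j] for j in range(n_attr)] for row in norm]

    # 4. Ideal point: best weighted value of each attribute.
    ideal = [
        (min if prefs[j] == "min" else max)(row[j] for row in weighted)
        for j in range(n_attr)
    ]

    # 5. Cosine between each weighted point and the ideal point.
    def cosine(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        return dot / (math.hypot(*u) * math.hypot(*v))

    return [cosine(row, ideal) for row in weighted]
```

For instance, with both attributes minimized, `skyline([(1, 9), (3, 3), (9, 1), (5, 5), (4, 4)], ("min", "min"))` keeps `(1, 9)`, `(3, 3)` and `(9, 1)`, and `cosky_scores` ranks the balanced point `(3, 3)` above the two extreme points, which tie with each other. A DBMS-embedded version would express the same steps as aggregate queries over the Skyline result; that translation is not shown here.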