Multi-criteria decision analysis in databases has been actively studied, especially through the Skyline operator. Yet few approaches offer a relevant comparison of Pareto-optimal (Skyline) points when the result set has high cardinality. We propose to improve dp-idp, a recent tf-idf-inspired approach that computes a score for each Skyline point, by introducing the concept of dominance hierarchy. As dp-idp lacks efficiency and does not ensure a distinctive rank, we introduce the RankSky method, an adaptation of Google's well-known PageRank solution that uses a square stochastic matrix, a teleportation matrix, a damping factor, a row score eigenvector, and the IPL algorithm. For the same reasons as RankSky, and also to offer a solution that is directly embeddable in a DBMS, we establish the TOPSIS-based CoSky method, derived from both information retrieval and multi-criteria analysis. CoSky automatically weights normalized attributes using the Gini index, then computes a score using Salton's cosine toward an ideal point. By coupling multilevel Skyline with dp-idp, RankSky, or CoSky, we introduce DeepSky. Implementations of dp-idp, RankSky, and CoSky are evaluated experimentally.
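To make the scoring idea concrete, the following is a minimal Python sketch of a Skyline computation followed by a cosine-toward-ideal-point score in the spirit of CoSky. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the normalization, the exact form of the Gini index, or how the ideal point is built. Here we assume that all attributes are to be minimized, that each attribute is normalized by its column sum, that the Gini index is taken as the impurity 1 - Σ x², and that the ideal point is the per-attribute minimum of the weighted values. The function names `dominates`, `skyline`, and `cosine_to_ideal_scores` are hypothetical.

```python
from math import sqrt


def dominates(p, q):
    """p dominates q if p is no worse on every attribute and strictly
    better on at least one (assumption: lower values are better)."""
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))


def skyline(points):
    """Naive O(n^2) Skyline: keep the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def cosine_to_ideal_scores(sky):
    """Score each Skyline point by its cosine similarity to an ideal point.

    Assumed steps (not confirmed by the abstract):
      1. normalize each attribute by its column sum;
      2. weight each attribute by its Gini impurity, 1 - sum(x^2),
         rescaled so that the weights sum to 1;
      3. take the per-attribute minimum of the weighted values as the ideal;
      4. score = cosine between the weighted point and the ideal point.
    """
    dims = range(len(sky[0]))
    totals = [sum(p[j] for p in sky) for j in dims]
    norm = [[p[j] / totals[j] if totals[j] else 0.0 for j in dims] for p in sky]

    gini = [1.0 - sum(row[j] ** 2 for row in norm) for j in dims]
    gini_sum = sum(gini)
    weights = [g / gini_sum if gini_sum else 1.0 / len(gini) for g in gini]

    weighted = [[weights[j] * row[j] for j in dims] for row in norm]
    ideal = [min(row[j] for row in weighted) for j in dims]

    def cosine(u, v):
        nu = sqrt(sum(a * a for a in u))
        nv = sqrt(sum(b * b for b in v))
        if nu == 0.0 or nv == 0.0:
            return 0.0  # cosine is undefined for a zero vector; fall back to 0
        return sum(a * b for a, b in zip(u, v)) / (nu * nv)

    return [cosine(row, ideal) for row in weighted]


# Usage: two attributes, both to be minimized (toy data, purely illustrative).
points = [(1, 9), (3, 3), (9, 1), (5, 5), (4, 8)]
sky = skyline(points)
scores = cosine_to_ideal_scores(sky)
ranked = sorted(zip(sky, scores), key=lambda t: t[1], reverse=True)
```

On this toy data the balanced point `(3, 3)` ranks first, because its weighted vector points in the same direction as the ideal point. The score gives an ordering of Skyline points that plain Pareto dominance cannot provide, since Skyline points are mutually incomparable by definition.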