The restricted skyline (rskyline) query is widely used in multi-criteria decision making. It generalizes the skyline query by additionally considering a set of personalized scoring functions $\mathcal{F}$. Since uncertainty is inherent in datasets for multi-criteria decision making, we study rskyline queries on uncertain datasets from both the complexity and the algorithmic perspectives. We formalize the problem of computing the rskyline probabilities of all data items and show that no algorithm can solve this problem in truly subquadratic time unless the orthogonal vectors conjecture fails. Because linear scoring functions are widely used in practice, we propose two efficient algorithms for the case where $\mathcal{F}$ is a set of linear scoring functions whose weights are described by linear constraints: one with near-optimal time complexity and the other with better expected time complexity. For special linear constraints involving a series of weight ratios, we further devise an algorithm with sublinear query time and polynomial preprocessing time. Extensive experiments demonstrate the effectiveness, efficiency, scalability, and usefulness of the proposed algorithms.