Since data uncertainty is inherent in multi-criteria decision making, advanced analysis of uncertain data has attracted rapidly growing attention in recent years. In this paper, we revisit the restricted skyline query on uncertain datasets from both complexity and algorithmic perspectives. Instead of conducting probabilistic restricted skyline analysis under threshold or top-$k$ semantics, we focus on a more general problem: computing the restricted skyline probability of every object. We prove that this problem cannot be solved in truly subquadratic time unless the Orthogonal Vectors conjecture fails, and we propose two algorithms, one with near-optimal time complexity and the other with better expected time complexity. We also propose an algorithm with sublinear query time and polynomial preprocessing time for the case where the preference region is described by $d - 1$ ratio bound constraints. Thorough experiments on real and synthetic datasets demonstrate the effectiveness of the problem and the efficiency of the proposed algorithms.