We introduce a novel noisy sorting model motivated by the Just Noticeable Difference (JND) model from experimental psychology. The goal of our model is to capture the low quality of data collected in crowdsourcing environments. Compared to other celebrated models of noisy sorting, our model does not rely on precise data-generation assumptions, and it captures the varying levels of difficulty of crowdsourced tasks, which can lead to different amounts of noise in the data. To handle such noisy data, we assume that we can verify some of the collected data using expert advice. This verification procedure is costly; hence, we aim to minimize the number of verifications we use. We propose a new efficient algorithm called CandidateSort, which we prove uses the optimal number of verifications in the noisy sorting models we consider. We characterize this optimal number of verifications by showing that it is linear in a parameter $k$, which intuitively measures the maximum number of comparisons that are wrong but not inconsistent in the crowdsourcing data.